<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Daniel Antal | Reprex</title><link>https://reprex-next.netlify.app/author/daniel-antal/</link><atom:link href="https://reprex-next.netlify.app/author/daniel-antal/index.xml" rel="self" type="application/rss+xml"/><description>Daniel Antal</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><image><url>https://reprex-next.netlify.app/author/daniel-antal/avatar_hud88ed22bc3c29040ee8bdf20c6cd6530_105602_270x270_fill_q75_lanczos_center.jpg</url><title>Daniel Antal</title><link>https://reprex-next.netlify.app/author/daniel-antal/</link></image><item><title>Learn R with Reprex</title><link>https://reprex-next.netlify.app/slides/learn-with-reprex/</link><pubDate>Fri, 07 Oct 2022 12:35:00 +0200</pubDate><guid>https://reprex-next.netlify.app/slides/learn-with-reprex/</guid><description>&lt;h1 id="big-data-creates-inequalities">Big Data Creates Inequalities&lt;/h1>
&lt;p>Only the largest corporations, best-endowed universities, and rich governments can afford data collection and processing capacities that are large enough to harness the advantages of AI.&lt;/p>
&lt;hr>
&lt;h2 id="slide-navigation">Slide navigation&lt;/h2>
&lt;p>Fullscreen: &lt;code>F&lt;/code>&lt;/p>
&lt;ul>
&lt;li>Next: &lt;code>&amp;gt;&lt;/code> or &lt;code>Space&lt;/code> | Previous: &lt;code>&amp;lt;&lt;/code>&lt;/li>
&lt;li>Start: &lt;code>Home&lt;/code> | Finish: &lt;code>End&lt;/code>&lt;/li>
&lt;li>Overview: &lt;code>Esc&lt;/code> | Speaker notes: &lt;code>S&lt;/code>&lt;/li>
&lt;li>Zoom: &lt;code>Alt + Click 🖱️&lt;/code>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="big-data-that-works-for-all">Big data that works for all&lt;/h2>
&lt;ul>
&lt;li>
&lt;p style="font-size:75%">No matter how big is the problem or how small is your team, `Reprex` fill your reports, dashboards, newsletters, books with data and its visualization.
&lt;/li>
&lt;li>
&lt;p style="font-size:75%">Learn R with us: you can reduce the inequalities by joining the open source movement, learning to run open source software, ask for help, improve the tutorials, the documentation, and eventually learn to make the computer work for you.
&lt;/li>
&lt;li>
&lt;p style="font-size:75%">Contributor Covenant: Participating in open source is often a highly collaborative experience. We’re encouraged to create in public view, and we’re incentivized to welcome contributions of all kinds from people around the world. This makes the practice of open source as much social as it is technical.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="get-inspired">Get Inspired&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://curators.dataobservatory.eu/inspiration.html" target="_blank" rel="noopener">Find more interesting and better data&lt;/a>: you don&amp;rsquo;t have to be a data scientist or write code to contribute to our projects.&lt;/li>
&lt;li>&lt;a href="https://data-feminism.mitpress.mit.edu/" target="_blank" rel="noopener">Data feminism&lt;/a>: Catherine D&amp;rsquo;Ignazio and Lauren Klein present a new way of thinking about data science and data ethics—one that is informed by intersectional feminist thought. Highly inspirational, free, open-source book.&lt;/li>
&lt;li>&lt;a href="https://rladies.org/" target="_blank" rel="noopener">RLadies&lt;/a> is a world-wide organization to promote gender diversity in the R community.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="contributor-covenant">Contributor Covenant&lt;/h2>
&lt;ul>
&lt;li>
&lt;p style="font-size:75%">We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:75%">We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_example_1.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_example_2.webp"
>
&lt;hr>
&lt;h2 id="run-code-from-tutorials">Run code from tutorials&lt;/h2>
&lt;p>&lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">retroharmonize.dataobservatory.eu&lt;/a>&lt;/br>
&lt;a href="https://retroharmonize.dataobservatory.eu/articles/retroharmonize.htmll" target="_blank" rel="noopener">🖱 Get started&lt;/a>&lt;/br>
[🖱️ Articles](&lt;a href="https://retroharmonize.dataobservatory.eu/articles/index.htm" target="_blank" rel="noopener">https://retroharmonize.dataobservatory.eu/articles/index.htm&lt;/a>&lt;/p>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_readme.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="github_issues_spotifyR.webp"
>
&lt;h2 id="find-help-ask-for-help-reprex">Find help, ask for help: reprex&lt;/h2>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_tutorials.webp"
>
&lt;h2 id="documentation-for-better-tutorials">Documentation for better tutorials&lt;/h2>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_r_testthat.webp"
>
&lt;h2 id="debugging-and-testing-code">Debugging and testing code&lt;/h2>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_r_documentation.webp"
>
&lt;h2 id="contribute-to-documentation">Contribute to documentation&lt;/h2>
&lt;hr>
&lt;h2 id="r-is-a-functional-language">R is a functional language&lt;/h2>
&lt;ul>
&lt;li>R is both a statistical environment and a programming language&lt;/li>
&lt;li>R, the open source and further developed version of the S language, is mainly functional&lt;/li>
&lt;li>If you have done a task twice, the third time you had better write a function so that you can repeat it reliably.&lt;/li>
&lt;li>Most of your effort will be to find a well-written function for your work&lt;/li>
&lt;li>If you cannot find a function, you will modify somebody else&amp;rsquo;s function, or eventually write your own&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_r_code.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="rmd_example.webp"
>
&lt;h2 id="r--yaml--markdown--web-ready">R + YAML + markdown = web ready&lt;/h2>
&lt;hr>
&lt;ul>
&lt;li>&lt;a href="https://learnxinyminutes.com/docs/yaml/" target="_blank" rel="noopener">Learn YAML in Y minutes&lt;/a>: tell the computer what you want to do with a document&lt;/li>
&lt;li>&lt;a href="https://rmarkdown.rstudio.com/authoring_basics.html" target="_blank" rel="noopener">R Markdown basics&lt;/a>: it is just a plain markdown that allows you to insert little R program &amp;lsquo;chunks&amp;rsquo;.&lt;/li>
&lt;li>&lt;a href="https://github.com/mundimark/awesome-markdown-editors" target="_blank" rel="noopener">Awesome markdown editors and pre-writers&lt;/a>: find a convenient tool&lt;/li>
&lt;li>&lt;a href="https://workspace.google.com/marketplace/app/docs_to_markdown/700168918607" target="_blank" rel="noopener">Google Docs to markdown&lt;/a>: practice by translating your Google Docs text to markdown. It is &lt;em>very&lt;/em> easy.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_website.webp"
>
&lt;h2 id="package-and-release-a-team-effort">Package and release: a team effort&lt;/h2>
&lt;hr>
&lt;h2 id="our-open-source-development-projects">Our open source development projects&lt;/h2>
&lt;p>🔢 &lt;a href="https://dataset.dataobservatory.eu/" target="_blank" rel="noopener">dataset&lt;/a>: Synchronize datasets with global knowledge hubs #️⃣ &lt;a href="https://statcodelists.dataobservatory.eu/" target="_blank" rel="noopener">statcodelists&lt;/a>: Make your data codes understood globally ♻️ &lt;a href="https://iotables.dataobservatory.eu/" target="_blank" rel="noopener">iotables&lt;/a>: Create economic or environmental impact assessments in any EU country 🌍 &lt;a href="https://regions.dataobservatory.eu/" target="_blank" rel="noopener">regions&lt;/a>: Create more granular statistics from raw survey data in any EU country ✅ &lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">retroharmonize&lt;/a>: Harmonize question banks, recycle answers from past surveys ⏭️ &lt;a href="https://reprex.nl/#releases" target="_blank" rel="noopener">all on one page&lt;/a>&lt;/p>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="create_with_reprex.webp"
>
&lt;h2 id="create-with-us">Create with us&lt;/h2>
&lt;hr>
&lt;h1 id="questions">Questions?&lt;/h1>
&lt;p>&lt;a href="https://reprex.nl/#contact" target="_blank" rel="noopener">Email&lt;/a> | &lt;a href="https://keybase.io/team/reprexcommunity" target="_blank" rel="noopener">Keybase&lt;/a>&lt;/p>
&lt;p>LinkedIn: &lt;a href="https://www.linkedin.com/in/antaldaniel/" target="_blank" rel="noopener">Daniel Antal&lt;/a> - &lt;a href="https://www.linkedin.com/company/68855596" target="_blank" rel="noopener">Reprex&lt;/a> | &lt;a href="https://reprex.nl/" target="_blank" rel="noopener">Home&lt;/a>&lt;/p></description></item><item><title>Learn R with Reprex</title><link>https://reprex-next.netlify.app/slides/learnr-with-reprex/</link><pubDate>Fri, 07 Oct 2022 12:35:00 +0200</pubDate><guid>https://reprex-next.netlify.app/slides/learnr-with-reprex/</guid><description>&lt;h1 id="big-data-creates-inequalities">Big Data Creates Inequalities&lt;/h1>
&lt;p>Only the largest corporations, best-endowed universities, and rich governments can afford data collection and processing capacities that are large enough to harness the advantages of AI.&lt;/p>
&lt;hr>
&lt;h2 id="slide-navigation">Slide navigation&lt;/h2>
&lt;p>Fullscreen: &lt;code>F&lt;/code>&lt;/p>
&lt;ul>
&lt;li>Next: &lt;code>&amp;gt;&lt;/code> or &lt;code>Space&lt;/code> | Previous: &lt;code>&amp;lt;&lt;/code>&lt;/li>
&lt;li>Start: &lt;code>Home&lt;/code> | Finish: &lt;code>End&lt;/code>&lt;/li>
&lt;li>Overview: &lt;code>Esc&lt;/code> | Speaker notes: &lt;code>S&lt;/code>&lt;/li>
&lt;li>Zoom: &lt;code>Alt + Click 🖱️&lt;/code>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="big-data-that-works-for-all">Big data that works for all&lt;/h2>
&lt;ul>
&lt;li>
&lt;p style="font-size:75%">No matter how big is the problem or how small is your team, `Reprex` fill your reports, dashboards, newsletters, books with data and its visualization.
&lt;/li>
&lt;li>
&lt;p style="font-size:75%">Learn R with us: you can reduce the inequalities by joining the open source movement, learning to run open source software, ask for help, improve the tutorials, the documentation, and eventually learn to make the computer work for you.
&lt;/li>
&lt;li>
&lt;p style="font-size:75%">Contributor Covenant: Participating in open source is often a highly collaborative experience. We’re encouraged to create in public view, and we’re incentivized to welcome contributions of all kinds from people around the world. This makes the practice of open source as much social as it is technical.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="data-feminism">Data Feminism&lt;/h2>
&lt;hr>
&lt;h2 id="get-inspired">Get Inspired&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://curators.dataobservatory.eu/inspiration.html" target="_blank" rel="noopener">Find more interesting and better data&lt;/a>: you don&amp;rsquo;t have to be a data scientist or write code to contribute to our projects.&lt;/li>
&lt;li>&lt;a href="https://data-feminism.mitpress.mit.edu/" target="_blank" rel="noopener">Data feminism&lt;/a>: Catherine D&amp;rsquo;Ignazio and Lauren Klein present a new way of thinking about data science and data ethics—one that is informed by intersectional feminist thought. Highly inspirational, free, open-source book.&lt;/li>
&lt;li>&lt;a href="https://rladies.org/" target="_blank" rel="noopener">RLadies&lt;/a> is a world-wide organization to promote gender diversity in the R community.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="contributor-covenant">Contributor Covenant&lt;/h2>
&lt;ul>
&lt;li>
&lt;p style="font-size:75%">We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:75%">We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_example_1.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_example_2.webp"
>
&lt;hr>
&lt;h2 id="run-code-from-tutorials">Run code from tutorials&lt;/h2>
&lt;p>&lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">retroharmonize.dataobservatory.eu&lt;/a>&lt;/br>
&lt;a href="https://retroharmonize.dataobservatory.eu/articles/retroharmonize.htmll" target="_blank" rel="noopener">🖱 Get started&lt;/a>&lt;/br>
&lt;a href="https://retroharmonize.dataobservatory.eu/articles/index.html" target="_blank" rel="noopener">🖱️ Articles&lt;/a>&lt;/p>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_readme.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="github_issues_spotifyR.webp"
>
&lt;h2 id="find-help-ask-for-help-reprex">Find help, ask for help: reprex&lt;/h2>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_tutorials.webp"
>
&lt;h2 id="documentation-for-better-tutorials">Documentation for better tutorials&lt;/h2>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_r_testthat.webp"
>
&lt;h2 id="debugging-and-testing-code">Debugging and testing code&lt;/h2>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_r_documentation.webp"
>
&lt;h2 id="contribute-to-documentation">Contribute to documentation&lt;/h2>
&lt;hr>
&lt;h2 id="r-is-a-functional-language">R is a functional language&lt;/h2>
&lt;ul>
&lt;li>R is both a statistical environment and a programming language&lt;/li>
&lt;li>R, the open source and further developed version of the S language, is mainly functional&lt;/li>
&lt;li>If you have done a task twice, the third time you had better write a function so that you can repeat it reliably.&lt;/li>
&lt;li>Most of your effort will be to find a well-written function for your work&lt;/li>
&lt;li>If you cannot find a function, you will modify somebody else&amp;rsquo;s function, or eventually write your own&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_r_code.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="rmd_example.webp"
>
&lt;h2 id="r--yaml--markdown--web-ready">R + YAML + markdown = web ready&lt;/h2>
&lt;hr>
&lt;ul>
&lt;li>&lt;a href="https://learnxinyminutes.com/docs/yaml/" target="_blank" rel="noopener">Learn YAML in Y minutes&lt;/a>: tell the computer what you want to do with a document&lt;/li>
&lt;li>&lt;a href="https://rmarkdown.rstudio.com/authoring_basics.html" target="_blank" rel="noopener">R Markdown basics&lt;/a>: it is just a plain markdown that allows you to insert little R program &amp;lsquo;chunks&amp;rsquo;.&lt;/li>
&lt;li>&lt;a href="https://github.com/mundimark/awesome-markdown-editors" target="_blank" rel="noopener">Awesome markdown editors and pre-writers&lt;/a>: find a convenient tool&lt;/li>
&lt;li>&lt;a href="https://workspace.google.com/marketplace/app/docs_to_markdown/700168918607" target="_blank" rel="noopener">Google Docs to markdown&lt;/a>: practice by translating your Google Docs text to markdown. It is &lt;em>very&lt;/em> easy.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_website.webp"
>
&lt;h2 id="package-and-release-a-team-effort">Package and release: a team effort&lt;/h2>
&lt;hr>
&lt;h2 id="our-open-source-development-projects">Our open source development projects&lt;/h2>
&lt;p>🔢 &lt;a href="https://dataset.dataobservatory.eu/" target="_blank" rel="noopener">dataset&lt;/a>: Synchronize datasets with global knowledge hubs #️⃣ &lt;a href="https://statcodelists.dataobservatory.eu/" target="_blank" rel="noopener">statcodelists&lt;/a>: Make your data codes understood globally ♻️ &lt;a href="https://iotables.dataobservatory.eu/" target="_blank" rel="noopener">iotables&lt;/a>: Create economic or environmental impact assessments in any EU country 🌍 &lt;a href="https://regions.dataobservatory.eu/" target="_blank" rel="noopener">regions&lt;/a>: Create more granular statistics from raw survey data in any EU country ✅ &lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">retroharmonize&lt;/a>: Harmonize question banks, recycle answers from past surveys ⏭️ &lt;a href="https://reprex.nl/#releases" target="_blank" rel="noopener">all on one page&lt;/a>&lt;/p>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="create_with_reprex.webp"
>
&lt;h2 id="create-with-us">Create with us&lt;/h2>
&lt;hr>
&lt;h1 id="questions">Questions?&lt;/h1>
&lt;p>&lt;a href="https://reprex.nl/#contact" target="_blank" rel="noopener">Email&lt;/a> | &lt;a href="https://keybase.io/team/reprexcommunity" target="_blank" rel="noopener">Keybase&lt;/a>&lt;/p>
&lt;p>LinkedIn: &lt;a href="https://www.linkedin.com/in/antaldaniel/" target="_blank" rel="noopener">Daniel Antal&lt;/a> - &lt;a href="https://www.linkedin.com/company/68855596" target="_blank" rel="noopener">Reprex&lt;/a> | &lt;a href="https://reprex.nl/" target="_blank" rel="noopener">Home&lt;/a>&lt;/p></description></item><item><title>Surveyharmonization</title><link>https://reprex-next.netlify.app/slides/surveyharmonization/</link><pubDate>Sun, 25 Sep 2022 12:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/slides/surveyharmonization/</guid><description>&lt;h1 id="survey-harmonization-workflow">Survey Harmonization Workflow&lt;/h1>
&lt;hr>
&lt;h2 id="controls">Controls&lt;/h2>
&lt;ul>
&lt;li>Next: &lt;code>Right Arrow&lt;/code> or &lt;code>Space&lt;/code>&lt;/li>
&lt;li>Previous: &lt;code>Left Arrow&lt;/code>&lt;/li>
&lt;li>Start: &lt;code>Home&lt;/code> | Finish: &lt;code>End&lt;/code>&lt;/li>
&lt;li>Overview: &lt;code>Esc&lt;/code> | Speaker notes: &lt;code>S&lt;/code> | Fullscreen: &lt;code>F&lt;/code>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="principles">Principles&lt;/h1>
&lt;ul>
&lt;li>Generic concept of surveying, i.e. examining and recording the features of an area (for example, an area of land) to construct a map, plan, or description.&lt;/li>
&lt;li>Structured data collection of the missing information, harmonization of knowledge.&lt;/li>
&lt;li>Reproducibility, not automation. On a small scale, anything can be done with &lt;code>Ctrl C + Ctrl V&lt;/code>, but it should be recorded and documented for future &lt;code>Ctrl C + Ctrl V&lt;/code>.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="surveyharmonies_activities.webp"
>
&lt;h2 id="timeline">Timeline&lt;/h2>
&lt;p>&lt;br>&lt;br>&lt;br>&lt;br>&lt;br>&lt;br>&lt;br>&lt;br>&lt;br>&lt;br>&lt;br>&lt;/p>
&lt;hr>
&lt;h2 id="concepts">Concepts&lt;/h2>
&lt;ol>
&lt;li>We harmonize &lt;a href="https://surveyharmonies.reprex.nl/concepts.html" target="_blank" rel="noopener">knowledge concepts&lt;/a>. Because knowledge concepts are very abstract, harmonizing them requires iterating between the desired output and the questionnaire or form items, and this work will continue throughout the project. The harmonization of concepts will allow us to link our survey data to pre-existing survey data, financial information, or any other source of information.&lt;/li>
&lt;/ol>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="surveyharmonies_kanban.webp"
>
&lt;hr>
&lt;h2 id="data-model">Data Model&lt;/h2>
&lt;ol start="2">
&lt;li>Data modeling enables us to place the information we gain from existing sources (for example, by recycling pre-existing questionnaire items and answers) into a knowledge graph together with our own data. A knowledge graph is a more flexible, future-proof, generalized database that connects pre-existing information with new information.&lt;/li>
&lt;/ol>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="surveyharmonies_issues.webp"
>
&lt;hr>
&lt;h2 id="question-bank">Question Bank&lt;/h2>
&lt;ol start="3">
&lt;li>The &lt;a href="https://surveyharmonies.reprex.nl/questionbank.html" target="_blank" rel="noopener">questionnaire harmonization&lt;/a> includes the harmonization of the question or entry form label (&lt;code>In the past 12 months, how many times have you been to a concert&lt;/code>) and the response scale (&lt;code>1&lt;/code>, &lt;code>2&lt;/code>, &lt;code>Do not remember&lt;/code>, &lt;code>Decline to say&lt;/code>). The harmonization must be made with other knowledge concepts (i.e. the concept of the concert) and survey questionnaires or annual report information fields.&lt;/li>
&lt;/ol>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="surveyharmonies_quesitonbank.webp"
>
&lt;hr>
&lt;h2 id="translations">Translations&lt;/h2>
&lt;ol start="4">
&lt;li>We must be able to work with translators and standardized &lt;a href="https://surveyharmonies.reprex.nl/translations.html" target="_blank" rel="noopener">translated labels&lt;/a>. We must have the question bank ready by the end of October. We should identify questionnaire items with IRIs rather than URIs. Labels are translated or localized: a generic &amp;lsquo;French&amp;rsquo; label is often unsuitable for French speakers in Belgium.&lt;/li>
&lt;/ol>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="surveyharmonies_translations.webp"
>
&lt;hr>
&lt;h2 id="fieldwork">Fieldwork&lt;/h2>
&lt;ol start="5">
&lt;li>We must carry out &lt;code>fieldwork&lt;/code>, i.e. surveying music-related problems. We will conduct the fieldwork with a cheap online tool (LimeSurvey or SurveyMonkey). The fieldwork will likely remain fully online or may contain a small, hybrid online interview element. The integration of fieldwork implementation is the least important task for us. Use whatever is convenient.&lt;/li>
&lt;/ol>
&lt;hr>
&lt;h2 id="code-and-save">Code and Save&lt;/h2>
&lt;ol start="6">
&lt;li>We must record the information into coded datasets that are saved into files. The success of the output harmonization will depend on the use of harmonized coding (we will use, whenever possible, SDMX code definitions, such as &amp;lsquo;F&amp;rsquo; = &amp;lsquo;female&amp;rsquo;) and the use of machine-readable, open, portable file formats. Potential users are small entities, so we will avoid the use of databases and favor the use of knowledge graphs instead.&lt;/li>
&lt;/ol>
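&lt;p>A minimal sketch of this step in Python, under stated assumptions: only the code &amp;lsquo;F&amp;rsquo; = &amp;lsquo;female&amp;rsquo; comes from the slide; the other codes, the column names, and the helper functions are hypothetical illustrations, not the project's actual code list or tooling.&lt;/p>

```python
import csv
import io

# "F" = "female" is taken from the slide; "M" is an illustrative assumption.
SEX_CODES = {"female": "F", "male": "M"}
# Assumed placeholder code for unknown or declined answers.
MISSING_CODE = "_U"


def code_answer(raw):
    """Map a free-text answer to a harmonized code."""
    return SEX_CODES.get(raw.strip().lower(), MISSING_CODE)


def to_csv(rows):
    """Serialize coded rows to CSV text, an open and portable file format."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["respondent_id", "sex"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical raw answers as they might arrive from an online survey tool.
raw_answers = [("r1", "Female"), ("r2", " male "), ("r3", "Decline to say")]
coded = [{"respondent_id": rid, "sex": code_answer(ans)} for rid, ans in raw_answers]
csv_text = to_csv(coded)
```

&lt;p>Keeping the code list in one place means every file uses the same codes, which is what later makes the coded answers joinable.&lt;/p>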
&lt;hr>
&lt;h2 id="output-harmonization">Output harmonization&lt;/h2>
&lt;ol start="7">
&lt;li>We will harmonize the data, which means that we will join the coded answers, taking into account the question labels, the value labels, and the various forms of missing information across all languages (e.g., the English and German versions of a question and its answer options).&lt;/li>
&lt;/ol>
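&lt;p>The join described above could be sketched as follows in Python. This is an illustration only: the German labels, the harmonized missing-value categories, and the function name are assumptions made for the example and are not taken from the project's question bank.&lt;/p>

```python
# Hypothetical crosswalk from (language, value label) to a harmonized value.
# The English labels follow the question bank slide; everything else is assumed.
HARMONIZED = {
    ("en", "1"): 1,
    ("en", "2"): 2,
    ("en", "Do not remember"): "do_not_know",
    ("en", "Decline to say"): "declined",
    ("de", "1"): 1,
    ("de", "2"): 2,
    ("de", "Erinnere mich nicht"): "do_not_know",
    ("de", "Keine Angabe"): "declined",
}


def harmonize(language, answer):
    """Return the harmonized value, or "unmapped" if the label is not in the crosswalk."""
    return HARMONIZED.get((language, answer), "unmapped")


# Two hypothetical survey waves answered in different languages.
english_wave = [("en", "2"), ("en", "Decline to say")]
german_wave = [("de", "1"), ("de", "Keine Angabe")]

# After harmonization the answers share one coding and can be analyzed together.
joined = [harmonize(lang, ans) for lang, ans in english_wave + german_wave]
```

&lt;p>Keeping the different kinds of missing information as separate categories, rather than collapsing them into one empty value, preserves the distinction between a respondent who does not remember and one who declines to answer.&lt;/p>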
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="difficult_bills_slide.webp"
>
&lt;hr>
&lt;h2 id="presentation">Presentation&lt;/h2>
&lt;ol start="8">
&lt;li>We will report the harmonized information using graphic visualizations, tables placed into presentation slides, books or web pages. We should have the templates based on test data ready in January.&lt;/li>
&lt;/ol>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="surveyharmonies_items.webp"
>
&lt;hr>
&lt;h1 id="questions">Questions?&lt;/h1>
&lt;p>&lt;a href="https://reprex.nl/#contact" target="_blank" rel="noopener">Email&lt;/a>&lt;/p>
&lt;p>LinkedIn: &lt;a href="https://www.linkedin.com/in/antaldaniel/" target="_blank" rel="noopener">Daniel Antal&lt;/a> - &lt;a href="https://www.linkedin.com/company/79286750" target="_blank" rel="noopener">Digital Music Observatory&lt;/a>&lt;/p></description></item><item><title>Reprex Open Collaboration NLAIC 2022</title><link>https://reprex-next.netlify.app/slides/reprex-nlaic-2022/</link><pubDate>Wed, 21 Sep 2022 18:00:00 +0200</pubDate><guid>https://reprex-next.netlify.app/slides/reprex-nlaic-2022/</guid><description>
&lt;section data-noprocess data-shortcode-slide
data-background-image="Reprex-NLAIC-2022-background.webp"
>
&lt;h1 id="reprex-is-looking-for-new-collaborations-within-the-nlaic">Reprex is looking for new collaborations within the NLAIC&lt;/h1>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="Reprex-NLAIC-2022-Reprex-Open.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="Reprex-NLAIC-2022-01.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="Reprex-NLAIC-2022-02.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="Reprex-NLAIC-2022-03.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="Reprex-NLAIC-2022-background.webp"
>
&lt;h1 id="questions">Questions?&lt;/h1>
&lt;p>&lt;a href="https://reprex.nl/#contact" target="_blank" rel="noopener">Email&lt;/a> | &lt;a href="https://keybase.io/team/reprexcommunity" target="_blank" rel="noopener">Keybase&lt;/a>&lt;/p>
&lt;p>LinkedIn: &lt;a href="https://www.linkedin.com/in/antaldaniel/" target="_blank" rel="noopener">Daniel Antal&lt;/a> - &lt;a href="https://www.linkedin.com/company/68855596" target="_blank" rel="noopener">Reprex&lt;/a>&lt;/p></description></item><item><title>Reprex</title><link>https://reprex-next.netlify.app/slides/reprex-esg-pitch/</link><pubDate>Wed, 21 Sep 2022 16:00:00 +0200</pubDate><guid>https://reprex-next.netlify.app/slides/reprex-esg-pitch/</guid><description>&lt;h1 id="big-data-creates-inequalities">Big Data Creates Inequalities&lt;/h1>
&lt;p>Only the largest corporations, best-endowed universities, and rich governments can afford data collection and processing capacities that are large enough to harness the advantages of AI.&lt;/p>
&lt;hr>
&lt;h2 id="slide-navigation">Slide navigation&lt;/h2>
&lt;p>Fullscreen: &lt;code>F&lt;/code>&lt;/p>
&lt;ul>
&lt;li>Next: &lt;code>&amp;gt;&lt;/code> or &lt;code>Space&lt;/code> | Previous: &lt;code>&amp;lt;&lt;/code>&lt;/li>
&lt;li>Start: &lt;code>Home&lt;/code> | Finish: &lt;code>End&lt;/code>&lt;/li>
&lt;li>Overview: &lt;code>Esc&lt;/code> | Speaker notes: &lt;code>S&lt;/code>&lt;/li>
&lt;li>Zoom: &lt;code>Alt + Click 🖱️&lt;/code>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="big-data-that-works-for-all">Big data that works for all&lt;/h1>
&lt;p>Reprex: No matter how big the problem is or how small your team is, we fill your reports, dashboards, newsletters, and books with data and its visualization.&lt;/p>
&lt;hr>
&lt;h2 id="connected-financial-and-sustainability-reporting-based-on-open-data">Connected financial and sustainability reporting based on open data&lt;/h2>
&lt;p>Eviota: We map the material impacts in your value chain and connect them with environmental or social data that is re-used from the public sector.&lt;/p>
&lt;hr>
&lt;h2 id="data-problems-reprex">Data problems: Reprex&lt;/h2>
&lt;ul>
&lt;li>
&lt;p style="font-size:95%">Most SMEs, and civil society organizations do not have a data scientist/engineer in their team, maybe not even an IT person or a HR professional to make such a hire.&lt;/p?
&lt;/li>
&lt;li>
&lt;p style="font-size:95%">When these organizations must solve novel problems, like connecting their financial accounts with environmental and social impact data or connecting to automated transaction systems (like in music), they need novel solutions that do not require managing a database within their organization.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="data-problems-eviota">Data problems: Eviota&lt;/h2>
&lt;ul>
&lt;li>
&lt;p style="font-size:95%">Most SMEs, and civil society organizations do not have a data scientist/engineer in their team, maybe not even an IT person or a HR professional to make such a hire.&lt;/p?
&lt;/li>
&lt;li>
&lt;p style="font-size:95%">To access green bank loans, insurance products, subsidies, or investments, or to keep track of their sustainability goals in line with the Paris Accord or gender equality plan, organizations must connect their accounting system to external environmental data. We connect their accounts with impact estimates from reliable scientific sources.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="data-problems-examples">Data problems (examples)&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;div style="width:200px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;img src="difficulty_bills_levels.jpg" height="130">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">The cost of questionnaire-based market research (survey) is increasing exponentially and offers mediocre results without an enormous question bank and harmonization with other surveys.(See &lt;a href="https://reprex.nl/data/surveys/" target="_blank" rel="noopener">🖱 blogpost&lt;/a>) &lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="photo-1490004047268-5259045aa2b4.jpg" height="130">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">Manual data acquisition is an error-prone and boring task for humans that requires many working hours (often not credited in consultancies, law firms, or research institutes.)&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="Sisyphus_Bodleian_Library.png" height="130">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">Wrangling spreadsheet tables or word processor documents by people without data knowledge is the &lt;a href="https://reprex.nl/post/2021-07-08-data-sisyphus/" target="_blank" rel="noopener">🖱 data Sisyphus&lt;/a>.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="our-solution-reprex">Our solution: Reprex&lt;/h2>
&lt;ul>
&lt;li>
&lt;p style="font-size:85%">We create data ecosystems with the modernization of the EU/OECD/UN-endorsed 'data observatory' concept. Our data observatory 3.0 uses the knowledge graphs of the web of data.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:85%">We acquire and process data on a scale in our data observatories. We acquire and process data on a scale in our data observatories. Our approach significantly reduces the cost of data acquisition and opens invisible, reliable governmental and scientific data sources. We are currently building five observatories, and one of them is already mature enough to be considered for official EU recognition (serving the music industry).&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:85%">We provide applications, for example, our Eviota application, which connects financial accounts with environmental and social data, and crates reliable indicators and benchmarks for the requirements of the sustainable finance package. &lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="our-solution-eviota-non-financial">Our solution: Eviota (Non-Financial)&lt;/h2>
&lt;ul>
&lt;li>
&lt;p style="font-size:80%">We create data ecosystems with the modernization of the EU/OECD/UN-endorsed 'data observatory' concept. Our Green Deal Data Observatory uses the knowledge graphs of the web of data and gives access to reliable, often unseen, hard-to-access ESG data sources.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:85%">We acquire and process data at scale in our data observatories. Our approach significantly reduces the cost of data acquisition and opens up otherwise invisible, reliable governmental and scientific data sources.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:85%">We provide applications, for example, our Eviota application, which connects financial accounts with environmental and social data and creates reliable indicators and benchmarks for the requirements of the sustainable finance package. Unlike our competitors, we can serve SMEs, too, at a competitive cost.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="our-solution-eviota-for-banks">Our solution: Eviota (For Banks)&lt;/h2>
&lt;ul>
&lt;li>
&lt;p style="font-size:70%">We create data ecosystems with the modernization of the EU/OECD/UN-endorsed 'data observatory' concept. Our Green Deal Data Observatory uses the knowledge graphs of the web of data and gives access to reliable, often unseen, hard-to-access ESG data sources.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:70%">We acquire and process data at scale in our data observatories. Our approach significantly reduces the cost of data acquisition and opens up otherwise invisible, reliable governmental and scientific data sources.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:70%">We provide applications, for example, our Eviota application, which connects financial accounts with environmental and social data and creates reliable indicators and benchmarks for the requirements of the sustainable finance package. We are validating our product in the regulatory sandbox of a central bank to show that we provide a cost-effective solution to many regulatory problems opened by the new &lt;a href="https://finance.ec.europa.eu/publications/sustainable-finance-package_en" target="_blank" rel="noopener">sustainable finance package of the EU&lt;/a>.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="uniiq">UNIIQ&lt;/h2>
&lt;ol start="5">
&lt;li>Objectives, including product roadmap (technology/time/money)&lt;/li>
&lt;li>Schematic overview of developments since inception&lt;/li>
&lt;/ol>
&lt;hr>
&lt;h2 id="market">Market&lt;/h2>
&lt;hr>
&lt;h2 id="competition">Competition&lt;/h2>
&lt;hr>
&lt;hr>
&lt;h2 id="know-how-and-integration-of-open-source-components">Know-how and integration of open source components&lt;/h2>
&lt;ul>
&lt;li>
&lt;p style="font-size:85%">Reprex has special know-how in mapping and connecting private datasets across the boundaries of organizations that often have conflicting interests. We developed this know-how over 10 years, working with the data of about 60 music industry actors, often with conflicting interests, in 12 countries.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:85%">Our team has many years of experience with the reuse of public sector information, or 'open data', and has built reliable open source software to process legally open but not readily downloadable, very valuable information that is not available from market vendors.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:85%">We use RDF (linked open data) and other technologies to link scattered small data to big data; we use our own R libraries to test and process various data into reliable statistical data or indicators.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:85%">Based on our unique data access and software, we are developing the Eviota App to connect financial accounts with environmental, social, and governance data.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="team-rewrite-with-gdo">Team&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;div style="width:200px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;img src="reprex_contributors_20220920_2_1.png" width="200">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">The two co-founders, &lt;a href="https://reprex.nl/authors/daniel_antal/" target="_blank" rel="noopener">🖱 Daniel Antal, CFA&lt;/a> and &lt;a href="https://reprex.nl/authors/andres/" target="_blank" rel="noopener">🖱 Andrés García Molina, PhD&lt;/a>, and the core team manage the ecosystems&amp;rsquo; development, develop knowledge management, and direct the software development. &lt;a href="https://reprex.nl/#team" target="_blank" rel="noopener">🖱 Team on full screen&lt;/a>&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="dmo_contributors_20220920_2_1.png" width="200">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">Each observatory has a broader team of users, data and knowledge curators, and developers. The most developed &lt;a href="https://music.dataobservatory.eu/#contributors" target="_blank" rel="noopener">🖱️ Digital Music Observatory&lt;/a> has 16 institutional users and a team of about 20 music and data professionals. The newer observatories have a smaller, initial service development and data curatorial team.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="timeline">Timeline&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Inception: Yes!Delft AI+Blockchain Product Market Fit Validation with the Digital Music Observatory&lt;/p>
&lt;/li>
&lt;li>
&lt;p>New observatory development started with computational antitrust, ESG reporting, and&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Several peer-reviewed software releases&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The DMO has more than 20 curators, a 3 million euro budget for 3 years, and an increasing user base.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Eviota and the Green Deal&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;ul>
&lt;li>
&lt;p>We are part of rOpenGov and have specialist knowledge of working with the national accounts data and ESG data that governments use to track their progress under the Paris Accord. We can access data cheaper, faster, and better than our competitors.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>We have the know-how to manage conflicts of interest and very complex data use rights.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="data-observatories-30">Data observatories 3.0&lt;/h2>
&lt;p style="font-size:90%">Reprex is offering shared data ecosystems. Our observatories are great solutions for organizations without a data specialization:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">🌳 Organizations that cannot afford to build a large enough data team to sustain consistent, extensive data collection and processing (many large institutions and companies)&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">🪴 Who cannot hire even a single data engineer or a data scientist (medium-sized companies, NGOs)&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">🌱 Who do not even have a permanent IT function (about 2 million European small enterprises and civil society organizations)&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;div style="width:200px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;img src="observatory_collage_3x2_800.png" height="140">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:60%">The European Union, the World Bank, OECD, and UN have facilitated the creation of more than 80 so-called &amp;lsquo;data observatories&amp;rsquo; to help companies, researchers, NGOs, and governments systematically collect data and knowledge.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="dmo_opening_page_20220920_16x9.png" height="140">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:60%">We are currently building one prototype for the European Music Observatory, financed by the European Union and music industry players (about 3-4 million euros). We would like to take over existing observatories or start new ones, at least 5 within 2 years.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="gold_panning_slide_notitle.png" height="140">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:60%">Our observatories are competitive, because they use high-quality open source scientific software; they exploit the new Data Governance Act and Open Data Directive, deploy web 3.0 data synchronization, and offer great value-added research products.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Platform products&lt;/th>
&lt;th>Value added data applications&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;p style="font-size:65%">The European Union, the World Bank, OECD, and UN have facilitated the creation of more than 80 so-called &amp;lsquo;data observatories&amp;rsquo; to help companies, researchers, NGOs, and governments systematically collect data and knowledge.&lt;/p>&lt;/td>
&lt;td>&lt;p style="font-size:65%">The different observatories offer different types of knowledge products, such as statistical yearbooks, various apps, and database access.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;p style="font-size:65%">Most of them use web 1.0 technologies and accumulate knowledge inefficiently. 20 of them have already been discontinued.&lt;/p>&lt;/td>
&lt;td>&lt;p style="font-size:65%">We are developing software solutions that exploit our platforms: we harmonize surveys and statistical data, and we automate research reporting and elements of market monitoring or ESG reporting.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;p style="font-size:65%">We are currently building one prototype for the European Music Observatory, financed by the European Union and music industry players (about 3-4 million euros). We would like to take over existing observatories or start new ones, at least 5 within 2 years.&lt;/p>&lt;/td>
&lt;td>&lt;p style="font-size:65%">Each observatory gives us immediate customer access to 3-4 large universities, 1-2 large consultancies, and various specialist institutions.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="marketing-strategy">Marketing strategy&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;div style="width:160px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;img src="dmo_opening_page_20220920_16x9.png" width="160">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Copyright management agencies like Buma/Stemra, music export offices, festivals and venues, University of Amsterdam, Sant’Anna, Economic University of Bratislava, ministries of culture, grant agencies.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="ccsi_opening_page_20220920_16x9.png" width="160">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">University of Amsterdam, Europeana, Sant’Anna, Hungarian Film Fund&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="gdo_opening_page_20220920_16x9.png" width="160">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Connected financial and sustainability reporting: bank consultancies, big four audit companies, large environmental NGOs.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="cdo_opening_page_20220920_16x9.png" width="160">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Antitrust agencies, law firms, and economics consultancies working on mergers and other competition-related issues.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="target-market-size">Target market size&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th style="text-align:left">&lt;div style="width:400px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;div style="width:400px">&lt;/div>&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:55%">The observatory platforms usually have a build-up cost of about 3-5 million euros and annual running costs of 0.1-3 million euros.&lt;/p>&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">We hope to gain at least a 10% share of the global observatory platform management market to pay for our basic data science team and R&amp;amp;D.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Our existing observatories give us access to the market research and public surveying markets (about €30-40 bn in the developed nations), particularly to their software component (€10 bn).&lt;/p>&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">&lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">retroharmonize&lt;/a> integrates pre-existing questionnaire-based surveys and new surveys. We see interest from the biggest global players. &lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Our existing observatories gave us access to environmental impact assessment, and we are currently building an ESG reporting tool with a central bank, a value bank, and a big four company.&lt;/p>&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Connected ESG reporting has a €4 bn market in the EU alone, and our &lt;a href="https://reprex.nl/apps/eviota/" target="_blank" rel="noopener">Eviota product&lt;/a> is very competitive. Due to regulatory pressure, we can harvest a decent share if we are able to attract venture capital.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="team">Team&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;div style="width:200px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;img src="reprex_contributors_20220920_2_1.png" width="200">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">The two co-founders, &lt;a href="https://reprex.nl/authors/daniel_antal/" target="_blank" rel="noopener">🖱 Daniel Antal, CFA&lt;/a> and &lt;a href="https://reprex.nl/authors/andres/" target="_blank" rel="noopener">🖱 Andrés García Molina, PhD&lt;/a>, and the core team manage the ecosystems&amp;rsquo; development, develop knowledge management, and direct the software development. &lt;a href="https://reprex.nl/#team" target="_blank" rel="noopener">🖱 Team on full screen&lt;/a>&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="dmo_contributors_20220920_2_1.png" width="200">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">Each observatory has a broader team of users, data and knowledge curators, and developers. The most developed &lt;a href="https://music.dataobservatory.eu/#contributors" target="_blank" rel="noopener">🖱️ Digital Music Observatory&lt;/a> has 16 institutional users and a team of about 20 music and data professionals. The newer observatories have a smaller, initial service development and data curatorial team.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="traction">Traction&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">💻 Our free scientific software products have a steadily growing user base (several thousand users globally.) &lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">📈 We are able to convert this to paying research automation services at a higher growth rate.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">🚀 We won four competitive tenders this year, but we feel that the slow tendering/acquisition/cash cycle is hampering our growth: we see far more opportunities than we can serve. Therefore, we are looking for investors.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="funding">Funding&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%"> We have a good track record in EU tenders, but we would like to build up this reputation in the Netherlands, too, mainly for new platforms.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">We help our non-profit users, such as cultural heritage organizations, music export offices, and collective rights management agencies, to get funding to use our platforms and services.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">Our for-profit users need a more polished, more user-friendly front end. Some are interested in joint ventures (like exploiting our survey capabilities). Venture capital would be preferred, as demand outstrips growth.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;!---
## Pool and take over work where humans fail
- The cost of questionnaire-based market research (survey) is increasing exponentially and offers mediocre results without an enormous question bank and harmonization with other surveys.
- Manual data acquisition is an error-prone and boring task for humans that requires many working hours (often not credited in consultancies, law firms, or research institutes.)
- Wrangling spreadsheet tables or word processor documents by people without data knowledge is the data Sisyphus.
---
## Open source software and open platform
- Our survey harmonization tool offers hundreds of thousands of answers for your questionnaire item from dozens of countries and many years. We reduce the market research cost while exponentially increasing its value with data harmonization.
- We use automated statistical software or web 3.0 technology to synchronize data automatically with our client's database, dashboard, or spreadsheet.
- Our observatories automate repetitive processing tasks like re-formatting, currency translation, measurement units, documentation, bibliography, and hypertext link management with many computerized 'unit tests.' We let the computer do the work where humans often make errors or remain hopelessly slow.
---
## Shared evidence ecosystems: data observatories
- Organizations that cannot afford to build a large enough data team to sustain consistent, extensive data collection and processing (many large institutions and companies)
- Who cannot hire even a single data engineer or a data scientist
- Who do not even have a permanent IT function (about 2 million European small enterprises and civil organizations)
---
--->
&lt;section data-noprocess data-shortcode-slide
data-background-image="contest-hague-award-2022.webp"
>
&lt;hr>
&lt;!---
&lt;div class="r-stack">
&lt;img class="fragment fade-out" data-fragment-index="0" src="https://placekitten.com/450/300" width="450" height="300">
&lt;img class="fragment current-visible" data-fragment-index="0" src="https://placekitten.com/300/450" width="300" height="450">
&lt;img class="fragment" src="https://placekitten.com/400/400" width="400" height="400">
&lt;/div>
---
&lt;div class="r-stack">
&lt;img class="fragment" src="https://placekitten.com/450/300" width="450" height="300">
&lt;img class="fragment" src="https://placekitten.com/300/450" width="300" height="450">
&lt;img class="fragment" src="https://placekitten.com/400/400" width="400" height="400">
&lt;/div>
---
## What are data observatories?
- There are more than 60 functional, and about 20 already discontinued data observatories, i.e. long-term, usually triangular (business, academic, policy) data collection institutions recognized by the EU, OECD or UNESCO, including the [European Observatory on Infringements of Intellectual Property Rights](https://single-market-economy.ec.europa.eu/industry/strategy/intellectual-property/enforcement-intellectual-property-rights/european-observatory-infringements-intellectual-property-rights_en#:~:text=The%20European%20Observatory%20on%20Infringements,countries%2C%20businesses%20and%20civil%20society.) of the EU or the [European Audiovisual Observatory](https://www.obs.coe.int/en/web/observatoire) of the Council of Europe.
---
--->
&lt;h2 id="do-it-smarter">Do it Smarter&lt;/h2>
&lt;ul>
&lt;li>Existing data observatories usually do not exchange standard data with statistical agencies, are not synchronized with the knowledge graphs of Europeana or national libraries, and their research output is usually not found in open science repositories.&lt;/li>
&lt;li>The Hague is the winner of the &lt;a href="https://thehague.com/businessagency/the-hague-the-winner-world-smart-city-award-2021" target="_blank" rel="noopener">World Smart City Award 2021&lt;/a>, and we would like to attract the planned European Music Observatory and other EU/UNESCO-recognized institutions to the city, building on the innovations of Reprex and the ecosystem of The Hague.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="questions">Questions?&lt;/h1>
&lt;p>&lt;a href="https://reprex.nl/#contact" target="_blank" rel="noopener">Email&lt;/a> | &lt;a href="https://keybase.io/team/reprexcommunity" target="_blank" rel="noopener">Keybase&lt;/a>&lt;/p>
&lt;p>LinkedIn: &lt;a href="https://www.linkedin.com/in/antaldaniel/" target="_blank" rel="noopener">Daniel Antal&lt;/a> - &lt;a href="https://www.linkedin.com/company/68855596" target="_blank" rel="noopener">Reprex&lt;/a>&lt;/p>
&lt;p>&lt;a href="https://reprex.nl/" target="_blank" rel="noopener">Home&lt;/a> - &lt;a href="https://reprex.nl/talk/impactcity-startup-support-xl/" target="_blank" rel="noopener">One Pager ImpactCity Startup Support XL&lt;/a>&lt;/p></description></item><item><title>Reprex</title><link>https://reprex-next.netlify.app/slides/reprex-impactcity/</link><pubDate>Wed, 21 Sep 2022 16:00:00 +0200</pubDate><guid>https://reprex-next.netlify.app/slides/reprex-impactcity/</guid><description>&lt;h1 id="big-data-creates-inequalities">Big Data Creates Inequalities&lt;/h1>
&lt;p>Only the largest corporations, best-endowed universities, and rich governments can afford data collection and processing capacities that are large enough to harness the advantages of AI.&lt;/p>
&lt;hr>
&lt;h2 id="slide-navigation">Slide navigation&lt;/h2>
&lt;p>Fullscreen: &lt;code>F&lt;/code>&lt;/p>
&lt;ul>
&lt;li>Next: &lt;code>️&amp;gt;&lt;/code> or &lt;code>Space&lt;/code> | Previous :️&lt;code>&amp;lt;&lt;/code>&lt;/li>
&lt;li>Start: &lt;code>Home&lt;/code> | Finish: &lt;code>End&lt;/code>&lt;/li>
&lt;li>Overview: &lt;code>Esc&lt;/code>| Speaker notes: &lt;code>S&lt;/code>&lt;/li>
&lt;li>Zoom: &lt;code>Alt + Click 🖱️&lt;/code>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="data-problems">Data problems&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;div style="width:200px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;img src="difficulty_bills_levels.jpg" height="130">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">The cost of questionnaire-based market research (surveys) is increasing exponentially, and surveys offer mediocre results without an enormous question bank and harmonization with other surveys. (See &lt;a href="https://reprex.nl/data/surveys/" target="_blank" rel="noopener">🖱 blogpost&lt;/a>.)&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="photo-1490004047268-5259045aa2b4.jpg" height="130">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">Manual data acquisition is an error-prone and boring task for humans that requires many working hours (often not credited in consultancies, law firms, or research institutes.)&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="Sisyphus_Bodleian_Library.png" height="130">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">Wrangling spreadsheet tables or word processor documents by people without data knowledge is the &lt;a href="https://reprex.nl/post/2021-07-08-data-sisyphus/" target="_blank" rel="noopener">🖱 data Sisyphus&lt;/a>.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="data-observatories-30">Data observatories 3.0&lt;/h2>
&lt;p style="font-size:90%">Reprex is offering shared data ecosystems. Our observatories are great solutions for organizations without a data specialization:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">🌳 Organizations that cannot afford to build a large enough data team to sustain consistent, extensive data collection and processing (many large institutions and companies)&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">🪴 Who cannot hire even a single data engineer or a data scientist (medium-sized companies, NGOs)&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">🌱 Who do not even have a permanent IT function (about 2 million European small enterprises and civil society organizations)&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;div style="width:200px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;img src="observatory_collage_3x2_800.png" height="140">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:60%">The European Union, the World Bank, OECD, and UN have facilitated the creation of more than 80 so-called &amp;lsquo;data observatories&amp;rsquo; to help companies, researchers, NGOs, and governments systematically collect data and knowledge.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="dmo_opening_page_20220920_16x9.png" height="140">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:60%">We are currently building one prototype for the European Music Observatory, financed by the European Union and music industry players (about 3-4 million euros). We would like to take over existing observatories or start new ones, at least 5 within 2 years.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="gold_panning_slide_notitle.png" height="140">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:60%">Our observatories are competitive, because they use high-quality open source scientific software; they exploit the new Data Governance Act and Open Data Directive, deploy web 3.0 data synchronization, and offer great value-added research products.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Platform products&lt;/th>
&lt;th>Value added data applications&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;p style="font-size:65%">The European Union, the World Bank, OECD, and UN have facilitated the creation of more than 80 so-called &amp;lsquo;data observatories&amp;rsquo; to help companies, researchers, NGOs, and governments systematically collect data and knowledge.&lt;/p>&lt;/td>
&lt;td>&lt;p style="font-size:65%">The different observatories offer different types of knowledge products, such as statistical yearbooks, various apps, and database access.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;p style="font-size:65%">Most of them use web 1.0 technologies and accumulate knowledge inefficiently. 20 of them have already been discontinued.&lt;/p>&lt;/td>
&lt;td>&lt;p style="font-size:65%">We are developing software solutions that exploit our platforms: we harmonize surveys and statistical data, and we automate research reporting and elements of market monitoring or ESG reporting.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;p style="font-size:65%">We are currently building one prototype for the European Music Observatory, financed by the European Union and music industry players (about 3-4 million euros). We would like to take over existing observatories or start new ones, at least 5 within 2 years.&lt;/p>&lt;/td>
&lt;td>&lt;p style="font-size:65%">Each observatory gives us immediate customer access to 3-4 large universities, 1-2 large consultancies, and various specialist institutions.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="marketing-strategy">Marketing strategy&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;div style="width:160px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;img src="dmo_opening_page_20220920_16x9.png" width="160">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Copyright management agencies like Buma/Stemra, music export offices, festivals and venues, University of Amsterdam, Sant’Anna, Economic University of Bratislava, ministries of culture, grant agencies.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="ccsi_opening_page_20220920_16x9.png" width="160">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">University of Amsterdam, Europeana, Sant’Anna, Hungarian Film Fund&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="gdo_opening_page_20220920_16x9.png" width="160">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Connected financial and sustainability reporting: bank consultancies, big four audit companies, large environmental NGOs.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="cdo_opening_page_20220920_16x9.png" width="160">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Antitrust agencies, law firms, and economics consultancies working on mergers and other competition-related issues.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="target-market-size">Target market size&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th style="text-align:left">&lt;div style="width:400px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;div style="width:400px">&lt;/div>&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:55%">The observatory platforms usually have a build-up cost of about 3-5 million euros and annual running costs of 0.1-3 million euros.&lt;/p>&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">We hope to gain at least a 10% share of the global observatory platform management market to pay for our basic data science team and R&amp;amp;D.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Our existing observatories give us access to the market research and public surveying markets (about €30-40 bn in the developed nations), particularly to their software component (€10 bn).&lt;/p>&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">&lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">retroharmonize&lt;/a> integrates pre-existing questionnaire-based surveys and new surveys. We see interest from the biggest global players. &lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Our existing observatories gave us access to environmental impact assessment, and we are currently building an ESG reporting tool with a central bank, a value bank, and a big four company.&lt;/p>&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Connected ESG reporting has a €4 bn market in the EU alone, and our &lt;a href="https://reprex.nl/apps/eviota/" target="_blank" rel="noopener">Eviota product&lt;/a> is very competitive. Due to regulatory pressure, we can harvest a decent share if we are able to attract venture capital.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="team">Team&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;div style="width:200px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;img src="reprex_contributors_20220920_2_1.png" width="200">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">The two co-founders, &lt;a href="https://reprex.nl/authors/daniel_antal/" target="_blank" rel="noopener">🖱 Daniel Antal, CFA&lt;/a> and &lt;a href="https://reprex.nl/authors/andres/" target="_blank" rel="noopener">🖱 Andrés García Molina, PhD&lt;/a>, and the core team manage the ecosystems&amp;rsquo; development, develop knowledge management, and direct the software development. &lt;a href="https://reprex.nl/#team" target="_blank" rel="noopener">🖱 Team on full screen&lt;/a>&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="dmo_contributors_20220920_2_1.png" width="200">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">Each observatory has a broader team of users, data and knowledge curators, and developers. The most developed &lt;a href="https://music.dataobservatory.eu/#contributors" target="_blank" rel="noopener">🖱️ Digital Music Observatory&lt;/a> has 16 institutional users and a team of about 20 music and data professionals. The newer observatories have a smaller, initial service development and data curatorial team.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="traction">Traction&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">💻 Our free scientific software products have a steadily growing user base (several thousand users globally). &lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">📈 We are able to convert this to paying research automation services at a higher growth rate.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">🚀 We won four competitive tenders this year, but the slow tendering/acquisition/cash cycle is hampering our growth: we see far more opportunities than we can serve. Therefore, we are looking for investors.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="funding">Funding&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%"> We have a good track record in EU tenders, but we would like to build up this reputation in the Netherlands, too, mainly for new platforms.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">We help our non-profit users, such as cultural heritage organizations, music export offices, and collective rights management agencies, get funding to use our platforms and services.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">Our for-profit users need a more polished, more user-friendly front end. Some are interested in joint ventures (like exploiting our survey capabilities). Venture capital would be preferred, as demand outstrips growth.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;!---
## Pool and take over work where humans fail
- The cost of questionnaire-based market research (survey) is increasing exponentially and offers mediocre results without an enormous question bank and harmonization with other surveys.
- Manual data acquisition is an error-prone and boring task for humans that requires many working hours (often not credited in consultancies, law firms, or research institutes.)
- Wrangling spreadsheet tables or word processor documents by people without data knowledge is the data Sisyphus.
---
## Open source software and open platform
- Our survey harmonization tool offers hundreds of thousands of answers for your questionnaire item from dozens of countries and many years. We reduce the market research cost while exponentially increasing its value with data harmonization.
- We use automated statistical software or web 3.0 technology to synchronize data automatically with our client's database, dashboard, or spreadsheet.
- Our observatories automate repetitive processing tasks like re-formatting, currency translation, measurement units, documentation, bibliography, and hypertext link management with many computerized 'unit tests.' We let the computer do the work where humans often make errors or remain hopelessly slow.
---
## Shared evidence ecosystems: data observatories
- Organizations that cannot afford to build a large enough data team to sustain consistent, extensive data collection and processing (many large institutions and companies)
- Who cannot hire even a single data engineer or a data scientist
- Who do not even have a permanent IT function (about 2 million European small enterprises and civil organizations)
---
--->
&lt;section data-noprocess data-shortcode-slide
data-background-image="contest-hague-award-2022.webp"
>
&lt;hr>
&lt;!---
&lt;div class="r-stack">
&lt;img class="fragment fade-out" data-fragment-index="0" src="https://placekitten.com/450/300" width="450" height="300">
&lt;img class="fragment current-visible" data-fragment-index="0" src="https://placekitten.com/300/450" width="300" height="450">
&lt;img class="fragment" src="https://placekitten.com/400/400" width="400" height="400">
&lt;/div>
---
&lt;div class="r-stack">
&lt;img class="fragment" src="https://placekitten.com/450/300" width="450" height="300">
&lt;img class="fragment" src="https://placekitten.com/300/450" width="300" height="450">
&lt;img class="fragment" src="https://placekitten.com/400/400" width="400" height="400">
&lt;/div>
---
## What are data observatories?
- There are more than 60 functional, and about 20 already discontinued data observatories, i.e. long-term, usually triangular (business, academic, policy) data collection institutions recognized by the EU, OECD or UNESCO, including the [European Observatory on Infringements of Intellectual Property Rights](https://single-market-economy.ec.europa.eu/industry/strategy/intellectual-property/enforcement-intellectual-property-rights/european-observatory-infringements-intellectual-property-rights_en#:~:text=The%20European%20Observatory%20on%20Infringements,countries%2C%20businesses%20and%20civil%20society.) of the EU or the [European Audiovisual Observatory](https://www.obs.coe.int/en/web/observatoire) of the Council of Europe.
---
--->
&lt;h2 id="do-it-smarter">Do it Smarter&lt;/h2>
&lt;ul>
&lt;li>Existing data observatories usually do not exchange standard data with statistical agencies, are not synchronized on the knowledge graphs of Europeana or national libraries, and their research output is usually not found in open science repositories.&lt;/li>
&lt;li>The Hague is the winner of the &lt;a href="https://thehague.com/businessagency/the-hague-the-winner-world-smart-city-award-2021" target="_blank" rel="noopener">World Smart City Award 2021&lt;/a>, and we would like to attract the planned European Music Observatory and other EU/UNESCO-recognized institutions to the city, building on the innovations of Reprex and the ecosystem of The Hague.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="questions">Questions?&lt;/h1>
&lt;p>&lt;a href="https://reprex.nl/#contact" target="_blank" rel="noopener">Email&lt;/a> | &lt;a href="https://keybase.io/team/reprexcommunity" target="_blank" rel="noopener">Keybase&lt;/a>&lt;/p>
&lt;p>LinkedIn: &lt;a href="https://www.linkedin.com/in/antaldaniel/" target="_blank" rel="noopener">Daniel Antal&lt;/a> - &lt;a href="https://www.linkedin.com/company/68855596" target="_blank" rel="noopener">Reprex&lt;/a>&lt;/p>
&lt;p>&lt;a href="https://reprex.nl/" target="_blank" rel="noopener">Home&lt;/a> - &lt;a href="https://reprex.nl/talk/impactcity-startup-support-xl/" target="_blank" rel="noopener">One Pager ImpactCity Startup Support XL&lt;/a>&lt;/p></description></item><item><title>Reprex Nominated for The Hague Innovators Award</title><link>https://reprex-next.netlify.app/post/2022-09-13-the-hague-innovators-award/</link><pubDate>Tue, 13 Sep 2022 08:12:00 +0200</pubDate><guid>https://reprex-next.netlify.app/post/2022-09-13-the-hague-innovators-award/</guid><description>&lt;div class="alert alert-note">
&lt;div>
Reprex is a finalist for The Hague Innovators Award 2022 and for the audience prize in the startup category, alongside our respected competitors Sibö, WECO, STHRIVE, and ECOBLOQ.
&lt;/div>
&lt;/div>
&lt;p>The transition towards a sustainable and inclusive economy depends on collaboration. That is why we are bringing together startups, scale-ups, investors, policymakers, and other impact makers from around the world in The Hague for the 7th edition of ImpactFest.&lt;/p>
&lt;p>With the &lt;a href="https://www.impactcity.nl/en/service/the-hague-innovators-challenge/" target="_blank" rel="noopener">The Hague Innovators Challenge&lt;/a>, the municipality of The Hague challenges startups, scale-ups, and students to present their innovative ideas for global issues, as described in the UN Sustainable Development Goals (SDGs).&lt;/p></description></item><item><title>Hague Innovators Award 2022</title><link>https://reprex-next.netlify.app/slides/hague-innovation-award-2022/</link><pubDate>Wed, 17 Aug 2022 12:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/slides/hague-innovation-award-2022/</guid><description>
&lt;section data-noprocess data-shortcode-slide
data-background-image="contest-hague-award-2022.webp"
>
&lt;hr>
&lt;h1 id="data-observatory-30">Data Observatory 3.0&lt;/h1>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="dataobservatory-mission-statement.png"
>
&lt;hr>
&lt;h2 id="controls">Controls&lt;/h2>
&lt;ul>
&lt;li>Next: &lt;code>Right Arrow&lt;/code> or &lt;code>Space&lt;/code>&lt;/li>
&lt;li>Previous: &lt;code>Left Arrow&lt;/code>&lt;/li>
&lt;li>Start: &lt;code>Home&lt;/code>&lt;/li>
&lt;li>Finish: &lt;code>End&lt;/code>&lt;/li>
&lt;li>Overview: &lt;code>Esc&lt;/code>&lt;/li>
&lt;li>Speaker notes: &lt;code>S&lt;/code>&lt;/li>
&lt;li>Fullscreen: &lt;code>F&lt;/code>&lt;/li>
&lt;li>Zoom: &lt;code>Alt + Click&lt;/code>&lt;/li>
&lt;li>&lt;a href="https://github.com/hakimel/reveal.js#pdf-export" target="_blank" rel="noopener">PDF Export&lt;/a>: &lt;code>E&lt;/code>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="shared-evidence-ecosystems-data-observatories">Shared evidence ecosystems: data observatories&lt;/h2>
&lt;ul>
&lt;li>Organizations that cannot afford to build a large enough data team to sustain consistent, extensive data collection and processing (many large institutions and companies)&lt;/li>
&lt;li>Who cannot hire even a single data engineer or a data scientist&lt;/li>
&lt;li>Who do not even have a permanent IT function (about 2 million European small enterprises and civil organizations)&lt;/li>
&lt;/ul>
&lt;h2 id="what-are-data-observatories">What are data observatories?&lt;/h2>
&lt;ul>
&lt;li>There are more than 60 functional, and about 20 already discontinued data observatories, i.e. long-term, usually triangular (business, academic, policy) data collection institutions recognized by the EU, OECD or UNESCO, including the &lt;a href="https://single-market-economy.ec.europa.eu/industry/strategy/intellectual-property/enforcement-intellectual-property-rights/european-observatory-infringements-intellectual-property-rights_en#:~:text=The%20European%20Observatory%20on%20Infringements,countries%2C%20businesses%20and%20civil%20society." target="_blank" rel="noopener">European Observatory on Infringements of Intellectual Property Rights&lt;/a> of the EU or the &lt;a href="https://www.obs.coe.int/en/web/observatoire" target="_blank" rel="noopener">European Audiovisual Observatory&lt;/a> of the Council of Europe.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="do-it-smarter">Do it Smarter&lt;/h2>
&lt;ul>
&lt;li>Existing data observatories usually do not exchange standard data with statistical agencies, are not synchronized on the knowledge graphs of Europeana or national libraries, and their research output is usually not found in open science repositories.&lt;/li>
&lt;li>The Hague is the winner of the &lt;a href="https://thehague.com/businessagency/the-hague-the-winner-world-smart-city-award-2021" target="_blank" rel="noopener">World Smart City Award 2021&lt;/a>, and we would like to attract the planned European Music Observatory and other EU/UNESCO-recognized institutions to the city, building on the innovations of Reprex and the ecosystem of The Hague.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="contest-hague-award-2022.webp"
>
&lt;hr>
&lt;h2 id="strategic-objectives">Strategic objectives&lt;/h2>
&lt;ul>
&lt;li>Develop our data observatories as &lt;a href="https://openscholarlyinfrastructure.org/posse/" target="_blank" rel="noopener">Open Scholarly Infrastructure&lt;/a>&lt;/li>
&lt;li>Place our &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital Music Observatory&lt;/a>, &lt;a href="https://ccsi.dataobservatory.eu/" target="_blank" rel="noopener">Cultural and Creative Data Observatory&lt;/a>, and &lt;a href="https://greendeal.dataobservatory.eu/" target="_blank" rel="noopener">Green Deal Data Observatory&lt;/a> on knowledge graphs of Europeana, Wikidata, and other open knowledge systems&lt;/li>
&lt;li>Harmonize research artefacts with open repositories such as Zenodo and Figshare.&lt;/li>
&lt;li>Achieve EU/UNESCO/OECD recognition for our self-governing, triangular (science-policy-business) data ecosystems as &lt;em>data observatories&lt;/em>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="digital-music-observatory">Digital Music Observatory&lt;/h2>
&lt;ul>
&lt;li>&lt;code>Listen Local&lt;/code> in Horizon Europe OpenMuse WP Diversity, Creative Europe MusicAIRE: connected and curated data on tens of thousands of musical works&lt;/li>
&lt;li>Our aim is to describe first the entire, currently legally available music repertoire of Slovakia and Lithuania, and then a large part of that of Ukraine.&lt;/li>
&lt;li>Connected with name authorities and web services.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="listen-local-slide.webp"
>
&lt;hr>
&lt;h2 id="possible-collaboration">Possible Collaboration&lt;/h2>
&lt;ul>
&lt;li>Connect the national collective management organization, the national library, and various services (Spotify, YouTube) to make the national repertoire more visible&lt;/li>
&lt;li>Create usage statistics for cultural diversity policies and for monitoring local content regulations&lt;/li>
&lt;li>Provide a best-practice example and open source tools for replication&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="creative-and-cultural-sectors-industries-data-observatory">Creative and Cultural Sectors and Industries Data Observatory&lt;/h2>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="ccsi-data-observatory.webp"
>
&lt;hr>
&lt;h2 id="possible-collaboration-1">Possible Collaboration&lt;/h2>
&lt;ul>
&lt;li>The &lt;a href="https://ccsi.dataobservatory.eu/" target="_blank" rel="noopener">CCSI Data Observatory&lt;/a> already has some data assets on Zenodo, and we can upgrade its API (both as a REST API with datacube and with a simple RDF serialization)&lt;/li>
&lt;li>Create usage statistics for cultural heritage objects and other cultural heritage policy data&lt;/li>
&lt;li>Revisit some modest deliverables of RECREO and seek new funding.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="green-deal-data-observatory">Green Deal Data Observatory&lt;/h2>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="green-deal-europeana-slide.webp"
>
&lt;hr>
&lt;h2 id="possible-collaboration-2">Possible Collaboration&lt;/h2>
&lt;ul>
&lt;li>The &lt;a href="https://greendeal.dataobservatory.eu/" target="_blank" rel="noopener">Green Deal Data Observatory&lt;/a> is currently being developed to provide free or very accessible environmental, social, and governance (ESG) reporting tools to the cultural sector.&lt;/li>
&lt;li>It could also be used to provide ecological context to cultural heritage objects (CHO) for greater awareness.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="technical-features">Technical Features&lt;/h1>
&lt;p>&lt;a href="https://reprex.nl/" target="_blank" rel="noopener">Reprex&lt;/a> | &lt;a href="https://introduction.dataobservatory.eu/" target="_blank" rel="noopener">Documentation&lt;/a>&lt;/p>
&lt;hr>
&lt;h2 id="fair">FAIR&lt;/h2>
&lt;ul>
&lt;li>&lt;input checked="" disabled="" type="checkbox"> FAIR metadata: Dublin Core &amp;amp; DataCite referential metadata&lt;/li>
&lt;li>&lt;input checked="" disabled="" type="checkbox"> Integration with Figshare and Zenodo for automated releases and publications&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="web-30">Web 3.0&lt;/h2>
&lt;small>
&lt;ul>
&lt;li>&lt;input checked="" disabled="" type="checkbox"> supported with optional, open source APIs to retrieve the data&lt;/li>
&lt;li>&lt;input checked="" disabled="" type="checkbox"> supported with RDF serialization&lt;/li>
&lt;/ul>
&lt;/small>
&lt;hr>
&lt;h2 id="dissemination-support">Dissemination Support&lt;/h2>
&lt;ul>
&lt;li>&lt;input checked="" disabled="" type="checkbox"> support automated publishing and releasing of data, visualizations, newsletters, and long-form documentation in auto-refreshing websites, blog posts, articles, or even books.&lt;/li>
&lt;li>&lt;input checked="" disabled="" type="checkbox"> develop an ecosystem of open source software that helps with the professional collection, processing, and documentation of data in conformity with the Data Governance Act, and that supports data sharing and data altruism.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="research-automation">Research Automation&lt;/h1>
&lt;hr>
&lt;h2 id="research-automation-1">Research automation&lt;/h2>
&lt;ul>
&lt;li>&lt;input checked="" disabled="" type="checkbox"> support research automation&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="questions">Questions?&lt;/h1>
&lt;p>&lt;a href="https://reprex.nl/#contact" target="_blank" rel="noopener">Email&lt;/a>&lt;/p>
&lt;p>LinkedIn: &lt;a href="https://www.linkedin.com/in/antaldaniel/" target="_blank" rel="noopener">Daniel Antal&lt;/a> - &lt;a href="https://www.linkedin.com/company/79286750" target="_blank" rel="noopener">Digital Music Observatory&lt;/a>&lt;/p></description></item><item><title>Slides</title><link>https://reprex-next.netlify.app/slides/data-observatory/</link><pubDate>Wed, 17 Aug 2022 12:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/slides/data-observatory/</guid><description>&lt;h1 id="big-data-creates-inequalities">Big Data Creates Inequalities&lt;/h1>
&lt;p>Only the largest corporations, best-endowed universities, and rich governments can afford data collection and processing capacities that are large enough to harness the advantages of AI.&lt;/p>
&lt;hr>
&lt;h2 id="slide-navigation">Slide navigation&lt;/h2>
&lt;p>Fullscreen: &lt;code>F&lt;/code>&lt;/p>
&lt;ul>
&lt;li>Next: &lt;code>&amp;gt;&lt;/code> or &lt;code>Space&lt;/code> | Previous: &lt;code>&amp;lt;&lt;/code>&lt;/li>
&lt;li>Start: &lt;code>Home&lt;/code> | Finish: &lt;code>End&lt;/code>&lt;/li>
&lt;li>Overview: &lt;code>Esc&lt;/code> | Speaker notes: &lt;code>S&lt;/code>&lt;/li>
&lt;li>Zoom: &lt;code>Alt + Click&lt;/code>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="data-problems">Data problems&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;div style="width:200px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;img src="difficulty_bills_levels.jpg" height="130">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">The cost of questionnaire-based market research (survey) is increasing exponentially and offers mediocre results without an enormous question bank and harmonization with other surveys (see &lt;a href="https://reprex.nl/data/surveys/" target="_blank" rel="noopener">🖱 blog post&lt;/a>). &lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="photo-1490004047268-5259045aa2b4.jpg" height="130">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">Manual data acquisition is an error-prone and boring task for humans that requires many working hours (often not credited in consultancies, law firms, or research institutes).&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="Sisyphus_Bodleian_Library.png" height="130">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">Wrangling spreadsheet tables or word processor documents by people without data knowledge is the &lt;a href="https://reprex.nl/post/2021-07-08-data-sisyphus/" target="_blank" rel="noopener">🖱 data Sisyphus&lt;/a>.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="data-observatories-30">Data observatories 3.0&lt;/h2>
&lt;p style="font-size:90%">Reprex is offering shared data ecosystems. Our observatories are great solutions for organizations without a data specialization:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">🌳 Organizations that cannot afford to build a large enough data team to sustain consistent, extensive data collection and processing (many large institutions and companies)&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">🪴 Who cannot hire even a single data engineer or a data scientist (medium-sized companies, NGOs)&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td style="text-align:left">&lt;p style="font-size:75%">🌱 Who do not even have a permanent IT function (about 2 million European small enterprises and civil society organizations)&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;div style="width:200px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;img src="observatory_collage_3x2_800.png" height="140">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:60%">The European Union, the World Bank, OECD, and UN have facilitated the creation of more than 80 so-called &amp;lsquo;data observatories&amp;rsquo; to help companies, researchers, NGOs, and governments systematically collect data and knowledge.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="https://placekitten.com/400/400" width="140" height="140">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:60%">We are currently building one prototype for the European Music Observatory, financed by the European Union and music industry players (circa 3-4 million euros). We would like to take over existing observatories or start new ones: at least five within two years.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="https://placekitten.com/400/400" width="140" height="140">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:60%">Our observatories are competitive, because they use high-quality open source scientific software; they exploit the new Data Governance Act and Open Data Directive, deploy web 3.0 data synchronization, and offer great value-added research products.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Platform products&lt;/th>
&lt;th>Value added data applications&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;p style="font-size:75%">The European Union, the World Bank, OECD, and UN have facilitated the creation of more than 80 so-called &amp;lsquo;data observatories&amp;rsquo; to help companies, researchers, NGOs, and governments systematically collect data and knowledge.&lt;/p>&lt;/td>
&lt;td>&lt;p style="font-size:75%">The different observatories offer different types of knowledge products, such as statistical yearbooks, various apps, and database access.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;p style="font-size:75%">Most of them use web 1.0 technologies and accumulate knowledge inefficiently. About 20 of them have already been discontinued.&lt;/p>&lt;/td>
&lt;td>&lt;p style="font-size:75%">We are developing software solutions that exploit our platforms: we harmonize surveys and statistical data, and we automate research reporting and elements of market monitoring and ESG reporting.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;p style="font-size:75%"> We are currently building one prototype for the European Music Observatory, financed by the European Union and music industry players (circa 3-4 million euros). We would like to take over existing observatories or start new ones: at least five within two years. &lt;/p>&lt;/td>
&lt;td>&lt;p style="font-size:75%">Each observatory gives us immediate customer access to 3-4 large universities, 1-2 large consultancies, and various specialist institutions. &lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="marketing-strategy">Marketing strategy&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;div style="width:160px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;img src="dmo_opening_page_20220920_16x9.png" width="160">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Copyright management agencies like Buma/Stemra, music export offices, festivals and venues, University of Amsterdam, Sant’Anna, Economic University of Bratislava, ministries of culture, grant agencies.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="ccsi_opening_page_20220920_16x9.png" width="160">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">University of Amsterdam, Europeana, Sant’Anna, Hungarian Film Fund&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="gdo_opening_page_20220920_16x9.png" width="160">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Connected financial and sustainability reporting: bank consultancies, big four audit companies, large environmental NGOs.&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="cdo_opening_page_20220920_16x9.png" width="160">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%">Antitrust agencies, law firms, and economics consultancies working with mergers and other competition-related issues.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="target-market-size">Target market size&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;/th>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;p style="font-size:55%">The observatory platforms usually have a build-up cost of about 3-5 million euros and annual running costs of 0.1-3 million euros.&lt;/p>&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:55%"> Some of our basic products are included in the platform service. &lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;p style="font-size:55%"> Our existing observatories give us access to the market research and public surveying markets (circa 30-40 billion euros in the developed nations), particularly to their software component (about 10 billion euros). &lt;/p>&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;p style="font-size:55%">&lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">retroharmonize&lt;/a> integrates pre-existing questionnaire-based surveys and new surveys. It is aimed at large, international survey companies (Kantar, GfK) and large international survey programs (Eurostat, GESIS).&lt;/p>&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>regions&lt;/code> improves the granularity of existing market research with ‘small area statistics’.&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;p style="font-size:55%">Our existing observatories gave us access to environmental impact assessment, and we are currently building an ESG reporting tool with a central bank, a value bank, and a Big Four company. &lt;/p>&lt;/td>
&lt;td style="text-align:left">&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;p style="font-size:55%">We hope to gain at least a 10% global share of the observatory platform management market to pay for our basic data science team and R&amp;amp;D. &lt;/p>&lt;/td>
&lt;td style="text-align:left">&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="team">Team&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;div style="width:200px">&lt;/div>&lt;/th>
&lt;th style="text-align:left">&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;img src="reprex_contributors_20220920_2_1.png" width="200">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">The two co-founders, &lt;a href="https://reprex.nl/authors/daniel_antal/" target="_blank" rel="noopener">🖱 Daniel Antal, CFA&lt;/a> and &lt;a href="https://reprex.nl/authors/andres/" target="_blank" rel="noopener">🖱 Andrés García Molina, PhD&lt;/a>, and the core team manage the ecosystems&amp;rsquo; development, develop knowledge management, and direct the software development. &lt;a href="https://reprex.nl/#team" target="_blank" rel="noopener">🖱 Team on full screen&lt;/a>&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;img src="dmo_contributors_20220920_2_1.png" width="200">&lt;/td>
&lt;td style="text-align:left">&lt;p style="font-size:65%">Each observatory has a broader team of users, data and knowledge curators, and developers. The most developed &lt;a href="https://music.dataobservatory.eu/#contributors" target="_blank" rel="noopener">🖱️ Digital Music Observatory&lt;/a> has 16 institutional users and a team of about 20 music and data professionals. The newer observatories have a smaller, initial service development and data curatorial team.&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="traction">Traction&lt;/h2>
&lt;p>We have good organic growth, but we mainly work with very large private and public entities with complicated, lengthy, and not startup-friendly procurement, with a cash-conversion cycle of up&lt;/p>
&lt;hr>
&lt;h2 id="funding">Funding&lt;/h2>
&lt;ul>
&lt;li>We have a good track record in EU tenders, but we would like to build up this reputation in the Netherlands, too, mainly for new platforms.&lt;/li>
&lt;li>We help our non-profit users, such as cultural heritage organizations, music export offices, and collective rights management agencies, get funding to use our platforms and services. We have a track record in the EU, Slovakia, and Lithuania, but not in the Netherlands.&lt;/li>
&lt;li>Our for-profit users need a more polished, more user-friendly front end. Some are interested in joint ventures (like exploiting our survey capabilities). Venture capital would be preferred, as demand outstrips growth.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;!---
## Pool and take over work where humans fail
- The cost of questionnaire-based market research (survey) is increasing exponentially and offers mediocre results without an enormous question bank and harmonization with other surveys.
- Manual data acquisition is an error-prone and boring task for humans that requires many working hours (often not credited in consultancies, law firms, or research institutes.)
- Wrangling spreadsheet tables or word processor documents by people without data knowledge is the data Sisyphus.
---
## Open source software and open platform
- Our survey harmonization tool offers hundreds of thousands of answers for your questionnaire item from dozens of countries and many years. We reduce the market research cost while exponentially increasing its value with data harmonization.
- We use automated statistical software or web 3.0 technology to synchronize data automatically with our client's database, dashboard, or spreadsheet.
- Our observatories automate repetitive processing tasks like re-formatting, currency translation, measurement units, documentation, bibliography, and hypertext link management with many computerized 'unit tests.' We let the computer do the work where humans often make errors or remain hopelessly slow.
---
## Shared evidence ecosystems: data observatories
- Organizations that cannot afford to build a large enough data team to sustain consistent, extensive data collection and processing (many large institutions and companies)
- Who cannot hire even a single data engineer or a data scientist
- Who do not even have a permanent IT function (about 2 million European small enterprises and civil organizations)
---
--->
&lt;section data-noprocess data-shortcode-slide
data-background-image="contest-hague-award-2022.webp"
>
&lt;hr>
&lt;!---
&lt;div class="r-stack">
&lt;img class="fragment fade-out" data-fragment-index="0" src="https://placekitten.com/450/300" width="450" height="300">
&lt;img class="fragment current-visible" data-fragment-index="0" src="https://placekitten.com/300/450" width="300" height="450">
&lt;img class="fragment" src="https://placekitten.com/400/400" width="400" height="400">
&lt;/div>
---
&lt;div class="r-stack">
&lt;img class="fragment" src="https://placekitten.com/450/300" width="450" height="300">
&lt;img class="fragment" src="https://placekitten.com/300/450" width="300" height="450">
&lt;img class="fragment" src="https://placekitten.com/400/400" width="400" height="400">
&lt;/div>
---
## What are data observatories?
- There are more than 60 functional, and about 20 already discontinued data observatories, i.e. long-term, usually triangular (business, academic, policy) data collection institutions recognized by the EU, OECD or UNESCO, including the [European Observatory on Infringements of Intellectual Property Rights](https://single-market-economy.ec.europa.eu/industry/strategy/intellectual-property/enforcement-intellectual-property-rights/european-observatory-infringements-intellectual-property-rights_en#:~:text=The%20European%20Observatory%20on%20Infringements,countries%2C%20businesses%20and%20civil%20society.) of the EU or the [European Audiovisual Observatory](https://www.obs.coe.int/en/web/observatoire) of the Council of Europe.
---
--->
&lt;h2 id="do-it-smarter">Do it Smarter&lt;/h2>
&lt;ul>
&lt;li>They usually do not exchange standard data with statistical agencies, they are not synchronized on knowledge graphs of the Europeana or national libraries, and their research output is usually not to be found on open science repositories.&lt;/li>
&lt;li>The Hague is the winner of the &lt;a href="https://thehague.com/businessagency/the-hague-the-winner-world-smart-city-award-2021" target="_blank" rel="noopener">World Smart City Award 2021&lt;/a>, and we would like to attract the planned European Music Observatory and other EU/UNESCO-recognized institutions to the city, building on the innovations of Reprex and the ecosystem of The Hague.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="questions">Questions?&lt;/h1>
&lt;p>&lt;a href="https://reprex.nl/#contact" target="_blank" rel="noopener">Email&lt;/a> | &lt;a href="https://keybase.io/team/reprexcommunity" target="_blank" rel="noopener">Keybase&lt;/a>&lt;/p>
&lt;p>LinkedIn: &lt;a href="https://www.linkedin.com/in/antaldaniel/" target="_blank" rel="noopener">Daniel Antal&lt;/a> - &lt;a href="https://www.linkedin.com/company/68855596" target="_blank" rel="noopener">Reprex&lt;/a>&lt;/p></description></item><item><title>Developing a software-as-service solution for micro-, and small enterprises</title><link>https://reprex-next.netlify.app/post/2022-06-09-music-eviota/</link><pubDate>Thu, 09 Jun 2022 09:40:00 +0100</pubDate><guid>https://reprex-next.netlify.app/post/2022-06-09-music-eviota/</guid><description>&lt;p>The music sector must increase its environmental and social (ESG) sustainability management to meet the challenges of the climate emergency and to make the music sector a fairer, more just workplace for womxn and artists coming from minorities and small countries. The European Union will make target setting and audited reporting mandatory in environmental and social sustainability for large companies. The application of these new accounting, reporting and disclosure rules is optional for the music sector, where almost all entities are micro-, or small enterprises and civil society organizations.&lt;/p>
&lt;p>Even if music organizations are not pushed by regulators to adopt these new standards, it is in their best interest to take the initiative on the principle of subsidiarity and develop tools that can be applied as an extension to their simplified financial and tax reporting. Music organizations and businesses that can prove that they are reducing their carbon footprint, making their water use more sustainable, and providing equal opportunities for womxn will be eligible for new, green bank and insurance products (which are particularly important in live music) and can attract new sponsors and donors.&lt;/p>
&lt;p>Compliance with these new rules is very costly because the tools are being developed for stock-exchange-listed big companies and financial institutions. The Commission&amp;rsquo;s impact assessment (SWD/2021/150 final) estimates the cost of compliance with the Corporate Sustainability Reporting Directive at more than 4 bn euros for European companies, or around 10,000 euros per company. Reprex, working together with large accounting, audit and value-based banking partners, scientific, research and industry partners in the Digital Music Observatory open knowledge collaboration, hopes to bring this cost below 500 euros, which will immediately pay off when a music organization receives green money.&lt;/p>
&lt;p>We are working on a simple interface that can connect the accounting system of micro and small enterprises with new methodologies, starting with greenhouse gas reporting with Reprex’s open source EEIO application &lt;a href="https://iotables.dataobservatory.eu/" target="_blank" rel="noopener">iotables&lt;/a>. We will keep many aspects of our software and data solution open, so that later methodological innovations and scientific achievements can be easily incorporated into the system. Reprex’s minimum viable product will be created in four iteration rounds in Malta, Czechia, Bulgaria and Belgium. However, our testing is open, for any amount of donation, to any music entity in the European Union that can provide input data in English or Dutch, or that is able to pay for its translation and localization costs.&lt;/p>
&lt;p>Link: &lt;a href="https://musicaire.eu/2022/07/12/final-list-of-awarded-projects/" target="_blank" rel="noopener">Final List of Awareded Projects by MusicAIRE&lt;/a>&lt;/p></description></item><item><title>Trustworthy Autonomous Recommender Systems on Music Streaming Platforms</title><link>https://reprex-next.netlify.app/post/2022-02-28-tas/</link><pubDate>Mon, 28 Feb 2022 19:00:00 +0100</pubDate><guid>https://reprex-next.netlify.app/post/2022-02-28-tas/</guid><description>&lt;p>Currently almost 60% of the global recording industry sales are made via streaming platforms. Given the enormity of choice on these platforms, and that music listening is a low-key, routine consumption choice, consumers are more and more relying on the recommendations of autonomous recommendation systems. Streaming platforms are two-sided markets, where recommendations are deployed to enhance the user experience on the consumer side, but they also decide the fate of the investments that composers, lyricists, producers, and performers made into the music. We are going to contribute to a research on how such systems may lead to potentially tilted competition field between the content providers, and more specifically, between major labels and independents.&lt;/p>
&lt;p>Reprex maintains the &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital Music Observatory&lt;/a> and the &lt;a href="https://reprex.nl/publication/listen_local_2020/" target="_blank" rel="noopener">Listen Local&lt;/a> system for granular microdata about music use in small territories (i.e., on small country or sub-national level.) We will provide data/expertise in music streaming and recommendation systems and links to many relevant stakeholders with our considerable experience running experiments on music platforms.&lt;/p>
&lt;p>A research team from the University of East Anglia (UEA), the University of Liverpool (UoL), the University of London (City), and King’s College (KCL), supported by the Competition and Markets Authority of the United Kingdom and Reprex, won a prestigious research grant to understand how recommender systems on music streaming platforms can employ trustworthy AI.&lt;/p>
&lt;p>The researchers will explore the relationship between the autonomous recommendation systems and entry barriers via simulation. Working closely with Reprex, they will simulate sets of users, and iteratively generate recommendation lists, which the simulated users will react to by deciding how long to engage for and which recommendations to listen to. Through their engagement, their user profiles will be updated based on what they listen to, which will feed into future recommendations.&lt;/p>
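&lt;p>The simulation loop described above can be sketched roughly as follows. This is only a minimal illustration, not the project's actual model: the function name &lt;code>simulate&lt;/code>, the 20-track catalogue, the fixed listening probability and the profile update rule are all assumptions made for the example.&lt;/p>

```python
import random


def simulate(n_users=50, n_rounds=20, list_size=5, seed=42):
    """Toy feedback loop: recommend, let users listen, update profiles."""
    rng = random.Random(seed)
    # Hypothetical catalogue: 20 tracks, alternating between two label types.
    catalogue = ["major" if i % 2 == 0 else "independent" for i in range(20)]
    # Every simulated user starts with a flat profile: one weight per track.
    profiles = [[1.0] * len(catalogue) for _ in range(n_users)]
    plays = {"major": 0, "independent": 0}
    for _ in range(n_rounds):
        for profile in profiles:
            # Recommendation list: tracks drawn with probability proportional
            # to the profile weights (with replacement, for simplicity).
            recommended = rng.choices(
                range(len(catalogue)), weights=profile, k=list_size
            )
            for track in recommended:
                # Assumed behaviour: each recommendation is played
                # with probability 0.5.
                if rng.random() > 0.5:
                    plays[catalogue[track]] += 1
                    # Listening feeds back into the next recommendations.
                    profile[track] += 1.0
    return plays


plays = simulate()
```

&lt;p>Comparing the play counts of the two label types across different update rules is one simple way to look for a tilted playing field in such a toy setting.&lt;/p>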
&lt;td style="text-align: center;">
&lt;figure id="figure-see-our-feasibility-study-for-listen-localhttpsreprexnlpublicationlisten_local_2020">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/media/img/reports/listen_local_2020/listen_local_study_covers.png" alt="See our Feasibility Study for [Listen Local](https://reprex.nl/publication/listen_local_2020/)." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
See our Feasibility Study for &lt;a href="https://reprex.nl/publication/listen_local_2020/" target="_blank" rel="noopener">Listen Local&lt;/a>.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>The empirical experiments of the project want to explore how autonomous recommendation systems are driving consumer choice in a real-life setting, and to establish causality between the recommendation systems and the barrier to entry. As part of the second work package, the researchers will conduct randomised trials by inviting participants to stream music through our own user interface. Reprex has extensive experience conducting similar experiments in the music domain (for various online, field experiments, and high-quality surveys.)&lt;/p>
&lt;p>Link: &lt;a href="https://www.tas.ac.uk/News/eight-new-tas-research-projects-announced/" target="_blank" rel="noopener">Eight new TAS research projects announced&lt;/a>&lt;/p></description></item><item><title>Reproducible Economic Impact Assessment</title><link>https://reprex-next.netlify.app/post/2021-12-20-environmental_impact/</link><pubDate>Sun, 19 Dec 2021 13:00:00 +0100</pubDate><guid>https://reprex-next.netlify.app/post/2021-12-20-environmental_impact/</guid><description>&lt;td style="text-align: center;">
&lt;figure id="figure-get-started-with--iotableshttpsiotablesdataobservatoryeuindexhtml">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/media/img/blogposts_2021/iotables_0_4_7.png" alt="Get started with [iotables](https://iotables.dataobservatory.eu/index.html)." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
Get started with &lt;a href="https://iotables.dataobservatory.eu/index.html" target="_blank" rel="noopener">iotables&lt;/a>.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>We made an important, peer-reviewed release of iotables in the last week as a preparation to increase the functionality of our open-source software. The official release of the iotables R package currently works with economic impact assessments, and can evaluate the likely employment, tax, wage, or gross value added direct, indirect and multiplied impacts of various policy changes in about 30 countries.&lt;/p>
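&lt;p>Direct, indirect and multiplied impacts of this kind are conventionally derived from the Leontief inverse of an input-output table. The sketch below shows the arithmetic for a two-sector economy in plain Python. It does not use the iotables API: the function name &lt;code>leontief_inverse_2x2&lt;/code> and the coefficient values are made up for illustration.&lt;/p>

```python
def leontief_inverse_2x2(a):
    """Return (I - A)^-1 for a 2x2 technical coefficient matrix A."""
    m11, m12 = 1.0 - a[0][0], -a[0][1]
    m21, m22 = -a[1][0], 1.0 - a[1][1]
    det = m11 * m22 - m12 * m21
    if det == 0:
        raise ValueError("I - A is singular")
    return [[m22 / det, -m12 / det], [-m21 / det, m11 / det]]


# Made-up coefficients: A[i][j] is the input required from sector i
# to produce one unit of output in sector j.
A = [[0.2, 0.3], [0.1, 0.4]]
L = leontief_inverse_2x2(A)

# Output multipliers are the column sums of the Leontief inverse.
multipliers = [L[0][j] + L[1][j] for j in range(2)]

# Total output needed to satisfy 100 units of extra final demand in sector 0.
demand = [100.0, 0.0]
total_output = [L[i][0] * demand[0] + L[i][1] * demand[1] for i in range(2)]

# The direct effect is the demand itself; the rest is indirect.
indirect = [total_output[i] - demand[i] for i in range(2)]
```

&lt;p>Employment, wage, tax or emission effects follow the same pattern: the relevant per-unit-of-output coefficients are multiplied by the total output change.&lt;/p>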
&lt;p>Originally the package was developed to calculate the economic impact of the Hungarian film tax shelter and the impact of the music sector on the Slovak economy. (See &lt;a href="https://music.dataobservatory.eu/publication/slovak_music_industry_2019/" target="_blank" rel="noopener">Slovak Music Industry Report&lt;/a>).&lt;/p>
&lt;p>The new CRAN release improved the documentation of the functions and removed most outdated dependencies. The new development version (which has not yet gone through peer review) adds new functionality for environmental impact analysis with the following pollutants: Carbon dioxide without emissions from biomass (&lt;code>CO2&lt;/code>), Carbon dioxide from biomass (&lt;code>Biomass CO2&lt;/code>), Nitrous oxide (&lt;code>N2O&lt;/code>), Methane (&lt;code>CH4&lt;/code>), Perfluorocarbons (&lt;code>PFCs&lt;/code>), Hydrofluorocarbons (&lt;code>HFCs&lt;/code>), Sulphur hexafluoride (&lt;code>SF6&lt;/code>) including nitrogen trifluoride (&lt;code>NF3&lt;/code>), Nitrogen oxides (&lt;code>NOx&lt;/code>), Non-methane volatile organic compounds (&lt;code>NMVOC&lt;/code>), Carbon monoxide (&lt;code>CO&lt;/code>), Particulate matter &amp;lt; 10μm (&lt;code>PM10&lt;/code>), Particulate matter &amp;lt; 2,5μm (&lt;code>PM2,5&lt;/code>), Sulphur dioxide (&lt;code>SO2&lt;/code>), Ammonia (&lt;code>NH3&lt;/code>) and their combinations (see &lt;a href="https://ec.europa.eu/eurostat/cache/metadata/en/env_ac_ainah_r2_esms.htm" target="_blank" rel="noopener">Reference Metadata in Euro SDMX Metadata Structure (ESMS)&lt;/a>).&lt;/p>
&lt;p>Our aim is to develop new sustainable finance applications, and understand the sustainability impacts of banks’ lending activities and insurers’ underwriting activities on climate change mitigation and adaptation, biodiversity, preservation of water reserves, preventing pollution, and promoting the circular economy.&lt;/p>
&lt;h2 id="eu-taxonomy-on-sustainable-activities">EU Taxonomy on Sustainable Activities&lt;/h2>
&lt;p>The European Commission created an &lt;a href="https://ec.europa.eu/sustainable-finance-taxonomy/tool/index_en.htm" target="_blank" rel="noopener">EU Taxonomy Compass&lt;/a>, which provides a visual representation of the contents of the EU Taxonomy, starting with the Delegated Act on the climate objectives, as adopted on 4 June 2021. Whilst you can download the EU Taxonomy in &lt;code>xlsx&lt;/code> or &lt;code>json&lt;/code> format, they are not tidy datasets, and they are not particularly well-suited for calculations, filtering, or inclusion in applications.&lt;/p>
&lt;p>&lt;a href="https://reprex.nl/" target="_blank" rel="noopener">Reprex&lt;/a> created a tidy version of the EU Taxonomy for developing better sustainability indicators into the &lt;a href="https://greendeal.dataobservatory.eu/" target="_blank" rel="noopener">Green Deal Data Observatory&lt;/a>.&lt;/p>
&lt;h2 id="open-data">Open Data&lt;/h2>
&lt;td style="text-align: center;">
&lt;figure id="figure-eu-taxonomy-on-sustainable-activities-tidy-downloadhttpszenodoorgrecord5791921yb9un2jmliu">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="EU Taxonomy on Sustainable Activities (Tidy) [download](https://zenodo.org/record/5791921#.Yb9UN2jMLIU)." srcset="
/media/img/blogposts_2021/tidy_eu_taxonomy_on_zenodo_huc44c1a1ba6df93a8d236ece24ac631bf_205462_b833f168c2c83ac62651705c602098e9.webp 400w,
/media/img/blogposts_2021/tidy_eu_taxonomy_on_zenodo_huc44c1a1ba6df93a8d236ece24ac631bf_205462_26dcc2483d83ed1843846dbbd7ae2829.webp 760w,
/media/img/blogposts_2021/tidy_eu_taxonomy_on_zenodo_huc44c1a1ba6df93a8d236ece24ac631bf_205462_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/blogposts_2021/tidy_eu_taxonomy_on_zenodo_huc44c1a1ba6df93a8d236ece24ac631bf_205462_b833f168c2c83ac62651705c602098e9.webp"
width="760"
height="428"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
EU Taxonomy on Sustainable Activities (Tidy) &lt;a href="https://zenodo.org/record/5791921#.Yb9UN2jMLIU" target="_blank" rel="noopener">download&lt;/a>.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>Using our &lt;code>iotables&lt;/code> is not for the faint of heart. It is scientific software, and working with it requires a good command of national accounts, input-output economics and sustainability. Our Green Deal Data Observatory is designed to be an API of scientific software, and to produce clean, ready-to-use data for researchers, policy-makers and business planners who do not have the skills to work with scientific software. We are planning to release well-designed datasets that go through dozens of checks to make sure they have the best data quality.&lt;/p>
&lt;div class="alert alert-note">
&lt;div>
Do you want to develop input-output models for any European country to measure the direct and indirect greenhouse gas impacts of policy actions? Do you need well-formatted data on interindustry linkages or other relevant topics for sustainable economy or sustainable finance research? Get in touch with us – we are happy to help and test our new software tool with data you need, and create high-quality, open datasets that are ready to use.
&lt;/div>
&lt;/div></description></item><item><title>Jumping Ahead With the Digital Music Observatory</title><link>https://reprex-next.netlify.app/post/2021-12-02-dmo-jump/</link><pubDate>Thu, 02 Dec 2021 13:00:00 +0100</pubDate><guid>https://reprex-next.netlify.app/post/2021-12-02-dmo-jump/</guid><description>&lt;p>Our &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital Music Observatory&lt;/a> project spent a year in the JUMP Music Market Accelerator&amp;rsquo;s program. Over the course of 9 months, co-founder Daniel Antal met many stakeholders from almost all European countries, got to know other new music technology startups and projects, and received mentoring and other professional help to further develop the project.&lt;/p>
&lt;p>The Digital Music Observatory is one of several initiatives to fill the data gaps of the fragmented European music ecosystems. While most of Europe’s music is available and promoted on data-heavy, AI-driven autonomous platforms like TikTok, Spotify, YouTube, and Deezer, music labels, publishers, and national export offices lack the necessary data solutions to remain competitive.&lt;/p>
&lt;td style="text-align: center;">
&lt;figure id="figure-daniel-is-pitching-for-partnership-with-music-tech-europe-on-linechech-and-finding-a-music-city-that-wants-to-be-the-seat-of-the-future-european-music-observatory-photo-wen-liuhttpwwwfestivalmarscomboard_memberwen-liu">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Daniel is pitching for partnership with Music Tech Europe on Linechech and finding a music city that wants to be the seat of the future European Music Observatory. Photo: [Wen Liu](http://www.festivalmars.com/?board_member=wen-liu/)." srcset="
/media/img/blogposts_2021/Daniel_Antal_JUMP_Linecheck_20211124_hubb7c99a8a93f52fd801facb5ffd737fb_546622_3654cf264af950c9a3f1454f74ad2286.webp 400w,
/media/img/blogposts_2021/Daniel_Antal_JUMP_Linecheck_20211124_hubb7c99a8a93f52fd801facb5ffd737fb_546622_054325de11d3eb6a4b8bcdf23a093705.webp 760w,
/media/img/blogposts_2021/Daniel_Antal_JUMP_Linecheck_20211124_hubb7c99a8a93f52fd801facb5ffd737fb_546622_1200x1200_fit_q75_h2_lanczos.webp 1200w"
src="https://reprex-next.netlify.app/media/img/blogposts_2021/Daniel_Antal_JUMP_Linecheck_20211124_hubb7c99a8a93f52fd801facb5ffd737fb_546622_3654cf264af950c9a3f1454f74ad2286.webp"
width="760"
height="557"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
Daniel is pitching for partnership with Music Tech Europe at Linecheck and finding a music city that wants to be the seat of the future European Music Observatory. Photo: &lt;a href="http://www.festivalmars.com/?board_member=wen-liu/" target="_blank" rel="noopener">Wen Liu&lt;/a>.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>One of the recurring themes of 2021 was the notion that the music streaming economy is broken. Several JUMP fellows are working on various projects that aim to fix this, and our Digital Music Observatory has both the data and track record to provide evidence and test ideas about possible solutions – change in pricing, better targeting in export and domestic markets, and checking for algorithmic biases. See what we have done in the field this year in the UK IPO-initiated &lt;a href="https://music.dataobservatory.eu/publication/mce_empirical_streaming_2021/" target="_blank" rel="noopener">Music Creators&amp;rsquo; Earnings&lt;/a> project; &lt;a href="https://music.dataobservatory.eu/publication/listen_local_2020/" target="_blank" rel="noopener">understanding algorithmic recommendation problems&lt;/a>
with the support of the Slovak Arts Council, and making recommendations about &lt;a href="https://music.dataobservatory.eu/publication/european_visibilitiy_2021/" target="_blank" rel="noopener">better music metadata and copyright regulation&lt;/a> with our research consortium.&lt;/p>
&lt;p>The other very interesting theme of the year was the emergence of new, immersive music tech companies. We hope that our Digital Music Observatory can grow into a hub for their data needs, too. How is the world of 2.7 billion gamers and music lovers forming a new market for &lt;a href="https://www.ristband.co/" target="_blank" rel="noopener">Ristband&lt;/a>? We would also like to curate data about the healing effects of sound, and work in the future with immersive, functional music providers like &lt;a href="http://flowerofsound.machinejockey.net/" target="_blank" rel="noopener">Flower of Sound&lt;/a> who place music and sound design into a less stressful, healthier acoustic environment.&lt;/p>
&lt;p>We were often criticized for placing too little emphasis on data visualization. Our next priority is to provide clear, beautiful infographics and charts for all of our datasets.&lt;/p>
&lt;p>There were many professionals who helped us in the JUMP program. We are particularly thankful to Alessandra di Caro (partnership building), Elodie Crouzet (program coordination), &lt;a href="http://stevefarrismusic.com/" target="_blank" rel="noopener">Steve Farris&lt;/a> (mentoring), &lt;a href="https://www.holz-consulting.de/en/veronique_friedrich/" target="_blank" rel="noopener">Veronique Friedrich&lt;/a> (team building), &lt;a href="https://speakerscoachbrussels.com/index.html" target="_blank" rel="noopener">Thierry Giesler&lt;/a> (improving our pitch) and Anna Zò (&lt;a href="https://musictecheurope.org/" target="_blank" rel="noopener">Music Tech Europe&lt;/a>).&lt;/p>
&lt;p>&lt;em>Are you a data user? Give us some feedback! Shall we do some further automatic data enhancements with our datasets? Document with different metadata? Link more information for business, policy, or academic use? Please give us any &lt;a href="https://reprex.nl/#contact" target="_blank" rel="noopener">feedback&lt;/a>!&lt;/em>&lt;/p></description></item><item><title>How We Add Value to Public Data With Better Curation And Documentation?</title><link>https://reprex-next.netlify.app/post/2021-11-08-indicator_findable/</link><pubDate>Mon, 08 Nov 2021 09:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-11-08-indicator_findable/</guid><description>&lt;p>In this example, we show a simple indicator: the &lt;em>Turnover in Radio Broadcasting Enterprises&lt;/em> in many European countries. This is an important demand driver in the &lt;em>Music economy&lt;/em> pillar of our &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital Music Observatory&lt;/a>, and an important indicator in our more general &lt;a href="https://ccsi.dataobservatory.eu/" target="_blank" rel="noopener">Cultural &amp;amp; Creative Sectors and Industries Observatory&lt;/a>. Of course, if you work with competition policy or antitrust, then any industry may be interesting to you&amp;ndash;but not all of them are well-served with data.&lt;/p>
&lt;p>This dataset comes from a public datasource, the data warehouse of the
European statistical agency, Eurostat. Yet it is not trivial to use:
unless you are familiar with national accounts, you will not find &lt;a href="https://appsso.eurostat.ec.europa.eu/nui/show.do?dataset=sbs_na_1a_se_r2&amp;amp;lang=en" target="_blank" rel="noopener">this dataset&lt;/a> on the Eurostat website.&lt;/p>
&lt;td style="text-align: center;">
&lt;figure id="figure-the-data-can-be-retrieved-from-the-annual-detailed-enterprise-statistics-for-services-nace-rev2-h-n-and-s95-eurostat-folder">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="The data can be retrieved from the Annual detailed enterprise statistics for services NACE Rev.2 H-N and S95 Eurostat folder." srcset="
/media/img/blogposts_2021/eurostat_radio_broadcasting_turnover_hu3e5de6ecefe0d9a061359c052e94da60_424359_48e8a82bfbe25df03a25f8ae1d3f8ec0.webp 400w,
/media/img/blogposts_2021/eurostat_radio_broadcasting_turnover_hu3e5de6ecefe0d9a061359c052e94da60_424359_4a73306788813c6365f0a1ca45775cd5.webp 760w,
/media/img/blogposts_2021/eurostat_radio_broadcasting_turnover_hu3e5de6ecefe0d9a061359c052e94da60_424359_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/blogposts_2021/eurostat_radio_broadcasting_turnover_hu3e5de6ecefe0d9a061359c052e94da60_424359_48e8a82bfbe25df03a25f8ae1d3f8ec0.webp"
width="760"
height="428"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
The data can be retrieved from the Annual detailed enterprise statistics for services NACE Rev.2 H-N and S95 Eurostat folder.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>Our version of this statistical indicator is documented following the &lt;a href="https://www.go-fair.org/fair-principles/" target="_blank" rel="noopener">FAIR principles&lt;/a>: our data assets
are findable, accessible, interoperable, and reusable. While the
Eurostat data warehouse partly fulfills these important data quality
expectations, we can improve them significantly. We can also
improve the dataset itself, as we will show in the &lt;a href="https://reprex-next.netlify.app/post/2021-11-06-indicator_value_added/">next blogpost&lt;/a>.&lt;/p>
&lt;details class="toc-inpage d-print-none " open>
&lt;summary class="font-weight-bold">Table of Contents&lt;/summary>
&lt;nav id="TableOfContents">
&lt;ul>
&lt;li>&lt;a href="#findable-data">Findable Data&lt;/a>&lt;/li>
&lt;li>&lt;a href="#accessible-data">Accessible Data&lt;/a>&lt;/li>
&lt;li>&lt;a href="#interoperability">Interoperability&lt;/a>&lt;/li>
&lt;li>&lt;a href="#reuse">Reuse&lt;/a>&lt;/li>
&lt;/ul>
&lt;/nav>
&lt;/details>
&lt;h2 id="findable-data">Findable Data&lt;/h2>
&lt;p>Our data observatories add value by curating the data&amp;ndash;we bring this
indicator to light with a more descriptive name, and we place it in a domain-specific context with our &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital Music Observatory&lt;/a> and &lt;a href="https://ccsi.dataobservatory.eu/" target="_blank" rel="noopener">Cultural &amp;amp; Creative Sectors and Industries Observatory&lt;/a> and a policy-specific context with our &lt;em>Competition Data Observatory&lt;/em> and &lt;em>Green Deal Data Observatory&lt;/em>. While many people may need this dataset in the creative sectors, or among cultural policy designers, most of them have no training in working with
national accounts, which implies deciphering national account data codes in records that measure economic activity at a national level. Our curated data observatories bring together much of the available data around important domains. Our &lt;code>Digital Music Observatory&lt;/code>, for example, aims to form an ecosystem of music data users and producers.&lt;/p>
&lt;td style="text-align: center;">
&lt;figure id="figure-we-added-descriptive-metadatahttpszenodoorgrecord5652113yykvbwdmkuk-that-help-you-find-our-data-and-match-it-with-other-relevant-data-sources">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="We [added descriptive metadata](https://zenodo.org/record/5652113#.YYkVBWDMKUk) that help you find our data and match it with other relevant data sources." srcset="
/media/img/blogposts_2021/zenodo_metadata_eurostat_radio_broadcasting_turnover_hu2432360a17d3ae8402b8f8c002a73e1d_314223_59bab6a7b48930f62147f1d33751b26b.webp 400w,
/media/img/blogposts_2021/zenodo_metadata_eurostat_radio_broadcasting_turnover_hu2432360a17d3ae8402b8f8c002a73e1d_314223_83fa751371ea12ffcd5187968e2bc3da.webp 760w,
/media/img/blogposts_2021/zenodo_metadata_eurostat_radio_broadcasting_turnover_hu2432360a17d3ae8402b8f8c002a73e1d_314223_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/blogposts_2021/zenodo_metadata_eurostat_radio_broadcasting_turnover_hu2432360a17d3ae8402b8f8c002a73e1d_314223_59bab6a7b48930f62147f1d33751b26b.webp"
width="760"
height="428"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
We &lt;a href="https://zenodo.org/record/5652113#.YYkVBWDMKUk" target="_blank" rel="noopener">added descriptive metadata&lt;/a> that help you find our data and match it with other relevant data sources.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>We added descriptive metadata that help you find our data and match it
with other relevant data sources. For example, we add keywords and
standardized metadata identifiers from the Library of Congress Linked
Data Services, probably the world’s largest standardized knowledge
library description. This ensures that you can find relevant data
around the same key term (&amp;quot;&lt;a href="https://id.loc.gov/authorities/subjects/sh85110448.html" target="_blank" rel="noopener">Radio broadcasting&lt;/a>&amp;quot;)
in addition to our turnover data. This allows connecting our dataset unambiguously
with other information sources that use the same concept, but may be listed under
different keywords, such as &lt;em>Radio–Broadcasting&lt;/em>, or &lt;em>Radio industry and
trade&lt;/em>, or maybe &lt;em>Hörfunkveranstalter&lt;/em> in German, or &lt;em>Emitiranje
radijskog programa&lt;/em> in Croatian or &lt;em>Actividades de radiodifusão&lt;/em> in
Portuguese.&lt;/p>
&lt;h2 id="accessible-data">Accessible Data&lt;/h2>
&lt;p>Our data is accessible in two forms: in &lt;code>csv&lt;/code> tabular format (which can be
read with Excel, OpenOffice, Numbers, SPSS and many similar spreadsheet
or statistical applications) and in &lt;code>JSON&lt;/code> for automated importing into
your databases. We can also provide our users with SQLite databases,
which are fully functional, single user relational databases.&lt;/p>
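&lt;p>As a rough illustration of how these formats fit together, the sketch below reads a small &lt;code>csv&lt;/code> table, round-trips it through &lt;code>JSON&lt;/code>, and loads it into an in-memory SQLite database using only the Python standard library. The column names (&lt;code>geo&lt;/code>, &lt;code>time&lt;/code>, &lt;code>value&lt;/code>), the table name and the numbers are made up for the example and are not taken from our published files.&lt;/p>

```python
import csv
import io
import json
import sqlite3

# Made-up sample data standing in for a downloaded csv file.
csv_text = "geo,time,value\nNL,2018,100.0\nNL,2019,110.0\nSK,2018,20.0\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# The same records serialised to JSON and parsed back.
json_text = json.dumps(rows)
records = json.loads(json_text)

# csv and JSON carry the values as text here, so convert types explicitly.
typed = [(r["geo"], int(r["time"]), float(r["value"])) for r in records]

# Load the records into a single-user, in-memory SQLite database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE indicator (geo TEXT, time INTEGER, value REAL)")
con.executemany("INSERT INTO indicator VALUES (?, ?, ?)", typed)

# Query the database like any other relational source.
nl_total = con.execute(
    "SELECT SUM(value) FROM indicator WHERE geo = ?", ("NL",)
).fetchone()[0]
con.close()
```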
&lt;p>Tidy datasets are easy to manipulate, model and visualize, and have a
specific structure: each variable is a column, each observation is a
row, and each type of observational unit is a table. This makes the data
easier to clean, and far easier to use in a much wider range of
applications than the original data we used. In theory, this is a simple objective,
yet we find that even governmental statistical agencies&amp;ndash;and even scientific
publications&amp;ndash;often publish untidy data. This poses a significant problem that implies
productivity losses: tidying data requires long hours of investment, and if
a reproducible workflow is not used, data integrity can also be compromised:
chances are that the process of tidying will overwrite, delete, or omit a data point or a label.&lt;/p>
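&lt;p>To make the idea concrete, here is a minimal sketch that reshapes an untidy table, with one column per year, into the tidy form with one row per observation. The helper name &lt;code>tidy&lt;/code>, the column names and the sample values are assumptions for illustration only. A scripted step like this is also what makes the tidying reproducible.&lt;/p>

```python
import csv
import io

# Made-up untidy table: the years are spread across the column headers,
# and one cell is missing.
wide_text = "geo,2018,2019\nNL,100.0,110.0\nSK,20.0,\n"


def tidy(wide_csv, id_column="geo"):
    """Turn one-column-per-year data into one row per (geo, time) pair."""
    reader = csv.DictReader(io.StringIO(wide_csv))
    tidy_rows = []
    for row in reader:
        for column, cell in row.items():
            if column == id_column:
                continue
            tidy_rows.append(
                {
                    id_column: row[id_column],
                    "time": int(column),
                    # Keep missing cells as explicit missing values
                    # instead of silently dropping the observation.
                    "value": float(cell) if cell else None,
                }
            )
    return tidy_rows


tidy_rows = tidy(wide_text)
```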
&lt;td style="text-align: center;">
&lt;figure id="figure-tidy-datasetshttpsr4dshadconztidy-datahtml-are-easy-to-manipulate-model-and-visualize-and-have-a-specific-structure-each-variable-is-a-column-each-observation-is-a-row-and-each-type-of-observational-unit-is-a-table">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="[Tidy datasets](https://r4ds.had.co.nz/tidy-data.html) are easy to manipulate, model and visualize, and have a specific structure: each variable is a column, each observation is a row, and each type of observational unit is a table." srcset="
/media/img/blogposts_2021/tidy-8_hub5468e0441f3c23e1be9aa13622e5d1a_299553_840d5597bab1e4d7c2b314453bf83608.webp 400w,
/media/img/blogposts_2021/tidy-8_hub5468e0441f3c23e1be9aa13622e5d1a_299553_f01845e0e6967cc9a3a2b53cf12edd0a.webp 760w,
/media/img/blogposts_2021/tidy-8_hub5468e0441f3c23e1be9aa13622e5d1a_299553_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/blogposts_2021/tidy-8_hub5468e0441f3c23e1be9aa13622e5d1a_299553_840d5597bab1e4d7c2b314453bf83608.webp"
width="760"
height="355"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
&lt;a href="https://r4ds.had.co.nz/tidy-data.html" target="_blank" rel="noopener">Tidy datasets&lt;/a> are easy to manipulate, model and visualize, and have a specific structure: each variable is a column, each observation is a row, and each type of observational unit is a table.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>While the original data source, the Eurostat data warehouse, is
accessible, too, we added value by bringing the data into a &lt;a href="https://www.jstatsoft.org/article/view/v059i10" target="_blank" rel="noopener">tidy
format&lt;/a>. Tidy data can
immediately be imported into a statistical application like SPSS or
Stata, or into your own database. It is immediately available for
plotting in Excel, OpenOffice or Numbers.&lt;/p>
&lt;h2 id="interoperability">Interoperability&lt;/h2>
&lt;p>Our data can be easily imported together with, or joined with, data from other internal or external sources.&lt;/p>
&lt;td style="text-align: center;">
&lt;figure id="figure-all-our-indicators-come-with-standardized-descriptive-metadata-and-statistical-processing-metadata-see-our-apihttpsapimusicdataobservatoryeudatabasemetadata">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="All our indicators come with standardized descriptive metadata, and statistical (processing) metadata. See our [API](https://api.music.dataobservatory.eu/database/metadata/) " srcset="
/media/img/observatory_screenshots/DMO_API_metadata_table_huec7c4d59af8b123db4454f856f161328_73739_bca19fc4770ab1d69e4e43df040c8c36.webp 400w,
/media/img/observatory_screenshots/DMO_API_metadata_table_huec7c4d59af8b123db4454f856f161328_73739_41b3d74277805b8a9efe561d4fa0fadb.webp 760w,
/media/img/observatory_screenshots/DMO_API_metadata_table_huec7c4d59af8b123db4454f856f161328_73739_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/observatory_screenshots/DMO_API_metadata_table_huec7c4d59af8b123db4454f856f161328_73739_bca19fc4770ab1d69e4e43df040c8c36.webp"
width="760"
height="428"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
All our indicators come with standardized descriptive metadata, and statistical (processing) metadata. See our &lt;a href="https://api.music.dataobservatory.eu/database/metadata/" target="_blank" rel="noopener">API&lt;/a>
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>All our indicators come with standardized descriptive metadata,
following two important standards, the &lt;a href="https://dublincore.org/" target="_blank" rel="noopener">Dublin Core&lt;/a> and
&lt;a href="https://datacite.org/" target="_blank" rel="noopener">DataCite&lt;/a>, implementing not only the mandatory
but also the recommended descriptions. This makes it far easier to
connect the data with other data sources, e.g. turnover with the number of radio broadcasting enterprises or radio stations within specific territories.&lt;/p>
&lt;p>Our passion for documentation standards and best practices goes much further: our data uses &lt;a href="https://sdmx.org/?page_id=3215/" target="_blank" rel="noopener">Statistical Data and Metadata eXchange&lt;/a> standardized codebooks, unit descriptions and other statistical and administrative metadata.&lt;/p>
&lt;td style="text-align: center;">
&lt;figure id="figure-we-participate-in-scientific-workhttpsreprexnlpublicationeuropean_visibilitiy_2021-related-to-data-interoperability">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="We participate in [scientific work](https://reprex.nl/publication/european_visibilitiy_2021/) related to data interoperability." srcset="
/media/img/reports/european_visbility_publication_hu9fd9bf0ebbda97354d76a2e1b9589f6b_264884_25232c9bd0c86814e3e3337261110ea4.webp 400w,
/media/img/reports/european_visbility_publication_hu9fd9bf0ebbda97354d76a2e1b9589f6b_264884_93fa43b83c3a299d78a1afed7bc4f820.webp 760w,
/media/img/reports/european_visbility_publication_hu9fd9bf0ebbda97354d76a2e1b9589f6b_264884_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/reports/european_visbility_publication_hu9fd9bf0ebbda97354d76a2e1b9589f6b_264884_25232c9bd0c86814e3e3337261110ea4.webp"
width="760"
height="506"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
We participate in &lt;a href="https://reprex.nl/publication/european_visibilitiy_2021/" target="_blank" rel="noopener">scientific work&lt;/a> related to data interoperability.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;h2 id="reuse">Reuse&lt;/h2>
&lt;p>All our datasets come with standardized information about reusability.
We add citation, attribution data, and licensing terms. Most of our
datasets can be used without commercial restriction after acknowledging
the source, but we sometimes work with less permissive data licenses.&lt;/p>
&lt;p>In the case presented here, we added further value to encourage re-use. In addition to tidying, we significantly increased the usability of public data by handling
missing cases. This is the subject of our &lt;a href="https://reprex-next.netlify.app/post/2021-11-06-indicator_value_added/">next blogpost&lt;/a>.&lt;/p>
&lt;details class="spoiler " id="spoiler-6">
&lt;summary>Are you a data user? How could we serve you better?&lt;/summary>
&lt;p>&lt;em>Shall we do some further automatic data enhancements with our datasets? Document with different metadata? Link more information for business, policy, or academic use? Please get in touch with &lt;a href="https://reprex.nl/#contact" target="_blank" rel="noopener">us&lt;/a>!&lt;/em>&lt;/p>
&lt;/details></description></item><item><title>How We Add Value to Public Data With Imputation and Forecasting</title><link>https://reprex-next.netlify.app/post/2021-11-06-indicator_value_added/</link><pubDate>Mon, 08 Nov 2021 10:00:00 +0100</pubDate><guid>https://reprex-next.netlify.app/post/2021-11-06-indicator_value_added/</guid><description>&lt;p>Public data sources are often plagued by missing values. Naively, you may think that you can ignore them, but think twice: in most cases, missing data in a table is not missing information, but rather malformatted information. Ignoring or dropping missing values is neither feasible nor robust when you want to make a beautiful visualization, or use data in a business forecasting model, a machine learning (AI) application, or a more complex scientific model. All of the above require complete datasets, and naively discarding missing data points amounts to an excessive waste of information. In this post, we continue with the example of a not-so-easy-to-find public dataset.&lt;/p>
&lt;td style="text-align: center;">
&lt;figure id="figure-in-the-previous-blogpostpost2021-11-08-indicator_findable-we-explained-how-we-added-value-by-documenting-data-following-the-fair-principle-and-with-the-professional-curatorial-work-of-placing-the-data-in-context-and-linking-it-to-other-information-sources-such-as-other-datasets-books-and-publications-regardless-of-their-natural-language-ie-whether-these-sources-are-described-in-english-german-portugese-or-croatian-photo-jack-sloophttpsunsplashcomphotoseywn81spkj8">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="[In the previous blogpost](/post/2021-11-08-indicator_findable/) we explained how we added value by documenting data following the *FAIR* principle and with the professional curatorial work of placing the data in context, and linking it to other information sources, such as other datasets, books, and publications, regardless of their natural language (i.e., whether these sources are described in English, German, Portuguese or Croatian). Photo: [Jack Sloop](https://unsplash.com/photos/eYwn81sPkJ8)." srcset="
/media/img/blogposts_2021/jack-sloop-eYwn81sPkJ8-unsplash_hu5d8f4a33b381dd8129d8c252a87ed0b3_4139695_6a66eba35e6a6a2451d2c0626a8d8b06.webp 400w,
/media/img/blogposts_2021/jack-sloop-eYwn81sPkJ8-unsplash_hu5d8f4a33b381dd8129d8c252a87ed0b3_4139695_7bf7f315b42bd4ba96d06a7c705ba035.webp 760w,
/media/img/blogposts_2021/jack-sloop-eYwn81sPkJ8-unsplash_hu5d8f4a33b381dd8129d8c252a87ed0b3_4139695_1200x1200_fit_q75_h2_lanczos.webp 1200w"
src="https://reprex-next.netlify.app/media/img/blogposts_2021/jack-sloop-eYwn81sPkJ8-unsplash_hu5d8f4a33b381dd8129d8c252a87ed0b3_4139695_6a66eba35e6a6a2451d2c0626a8d8b06.webp"
width="760"
height="507"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
&lt;a href="https://reprex-next.netlify.app/post/2021-11-08-indicator_findable/">In the previous blogpost&lt;/a> we explained how we added value by documenting data following the &lt;em>FAIR&lt;/em> principle and with the professional curatorial work of placing the data in context, and linking it to other information sources, such as other datasets, books, and publications, regardless of their natural language (i.e., whether these sources are described in English, German, Portuguese or Croatian). Photo: &lt;a href="https://unsplash.com/photos/eYwn81sPkJ8" target="_blank" rel="noopener">Jack Sloop&lt;/a>.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>Completing missing data points requires statistical production information (why might the data be missing?) and data science know-how (how to impute the missing value). If you do not have a good statistician or data scientist on your team, you will need high-quality, complete datasets. This is what our automated data observatories provide.&lt;/p>
&lt;details class="toc-inpage d-print-none " open>
&lt;summary class="font-weight-bold">Table of Contents&lt;/summary>
&lt;nav id="TableOfContents">
&lt;ul>
&lt;li>&lt;a href="#why-is-data-missing">Why is data missing?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#what-can-we-improve">What can we improve?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#can-you-trust-our-data">Can you trust our data?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#avoid-the-data-sisyphus">Avoid the data Sisyphus&lt;/a>&lt;/li>
&lt;li>&lt;a href="#get-the-data">Get the data&lt;/a>&lt;/li>
&lt;li>&lt;a href="#how-can-we-do-better">How can we do better?&lt;/a>&lt;/li>
&lt;/ul>
&lt;/nav>
&lt;/details>
&lt;h2 id="why-is-data-missing">Why is data missing?&lt;/h2>
&lt;p>International organizations offer many statistical products, but usually on an ‘as-is’ basis. For example, Eurostat is the world’s premier statistical agency, but it has no right to overrule whatever data the member states of the European Union and some other cooperating European countries give to it. Nor can it force these countries to hand over data if they fail to do so. As a result, many data points will be missing, and others will often have wrong (obsolete) descriptions or geographical dimensions. We will show the geographical aspect of the problem in a separate blogpost; for now, we focus only on missing data.&lt;/p>
&lt;p>Some countries have only recently started providing data to the Eurostat umbrella organization, and it is likely that you will find few datapoints for North Macedonia or Bosnia-Herzegovina. Other countries provide data with some delay, and the last one or two years are missing. And there are gaps in some countries’ data, too.&lt;/p>
&lt;td style="text-align: center;">
&lt;figure id="figure-see-the-authoritative-copy-of-the-datasethttpszenodoorgrecord5652118yykhvmdmkuk">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="See the authoritative copy of the [dataset](https://zenodo.org/record/5652118#.YYkhVmDMKUk)." srcset="
/media/img/blogposts_2021/trb_plot_hu2f07a4d8566fea4aefe16ab33a0f6ff8_386734_61f5b96b14ca649585f96612d0148277.webp 400w,
/media/img/blogposts_2021/trb_plot_hu2f07a4d8566fea4aefe16ab33a0f6ff8_386734_f9c7c983b2d12bac8c235d8f74c64b48.webp 760w,
/media/img/blogposts_2021/trb_plot_hu2f07a4d8566fea4aefe16ab33a0f6ff8_386734_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/blogposts_2021/trb_plot_hu2f07a4d8566fea4aefe16ab33a0f6ff8_386734_61f5b96b14ca649585f96612d0148277.webp"
width="760"
height="507"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
See the authoritative copy of the &lt;a href="https://zenodo.org/record/5652118#.YYkhVmDMKUk" target="_blank" rel="noopener">dataset&lt;/a>.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>This is a headache if you want to use the data in some machine learning application or in a multiple or panel regression model. You can, of course, discard countries or years where you do not have full data coverage, but this approach usually wastes too much information&amp;ndash;if you work with 12 years, and only one data point is missing, you would be discarding an entire country’s 11 years’ worth of data. Another option is to estimate the values, or otherwise impute the missing data, when this is possible with reasonable precision. This is where things get tricky, and you will likely need a statistician or a data scientist on board.&lt;/p>
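&lt;p>The cost of discarding incomplete countries can be counted directly. The following Python sketch assumes a hypothetical 12-year panel with two made-up countries, one of which lacks a single year; the &lt;code>panel&lt;/code> data and the helpers &lt;code>complete_cases&lt;/code> and &lt;code>count_known&lt;/code> are ours, for illustration only.&lt;/p>

```python
# Hypothetical 12-year panel for two made-up countries; None marks a missing year.
years = list(range(2008, 2020))
panel = {
    "AA": {year: 100.0 + i for i, year in enumerate(years)},
    "BB": {year: (None if year == 2015 else 50.0 + i) for i, year in enumerate(years)},
}


def complete_cases(data):
    """Keep only the countries that have a value for every year."""
    return {
        geo: series
        for geo, series in data.items()
        if all(value is not None for value in series.values())
    }


def count_known(data):
    """Count the observed (non-missing) values in the panel."""
    return sum(
        1 for series in data.values() for value in series.values() if value is not None
    )


kept = complete_cases(panel)
discarded_known = count_known(panel) - count_known(kept)
print(sorted(kept), discarded_known)
```

&lt;p>In this toy panel, one missing value removes a whole country, and the eleven values that were actually observed for it are thrown away together with the gap.&lt;/p>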
&lt;h2 id="what-can-we-improve">What can we improve?&lt;/h2>
&lt;p>Suppose that the data is missing for only one year, 2015, for a particular country. The naive solution would be to omit 2015 or the country at hand from the dataset. This is pretty destructive, because we know a lot about the radio market turnover in this country and in this year! But leaving 2015 blank will not look good on a chart, and will make your machine learning application or your regression model stop.&lt;/p>
&lt;p>A statistician or a radio market expert will tell you that you more or less know the missing information: the total turnover was certainly not zero in that year. With some statistical or radio domain-specific knowledge, you can use the 2014 or the 2016 value, or a combination of the two, and keep the country and year in the dataset.&lt;/p>
&lt;p>Our improved dataset added backcasted (using the best time series model fitting the country&amp;rsquo;s actually present data), forecasted (again, using the best time series model), and approximated data (using linear approximation). In a few cases, we add the last or next known value. To give a few quantitative indicators about our work:&lt;/p>
&lt;ul>
&lt;li>Increased number of observations: +65%&lt;/li>
&lt;li>Reduced missing values: -48.1%&lt;/li>
&lt;li>Increased non-missing subset for regression or AI: +66.67%&lt;/li>
&lt;/ul>
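&lt;p>Two of the simpler techniques named above, linear approximation between known neighbours and carrying the last or next known value, can be sketched in a few lines of Python. This is an illustration under our own assumptions, not the code that produced the dataset: the &lt;code>fill_gaps&lt;/code> helper and the series are hypothetical, and the model-based backcasting and forecasting steps are deliberately left out.&lt;/p>

```python
def fill_gaps(values):
    """Fill None entries: linear approximation between known neighbours,
    next known value at the start, last known value at the end."""
    known = [i for i, value in enumerate(values) if value is not None]
    if not known:
        return list(values)  # nothing to anchor an estimate on
    filled = list(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        before = [k for k in known if i > k]
        after = [k for k in known if k > i]
        if before and after:
            left, right = before[-1], after[0]
            share = (i - left) / (right - left)
            filled[i] = values[left] + share * (values[right] - values[left])
        elif before:
            filled[i] = values[before[-1]]  # last known value carried forward
        else:
            filled[i] = values[after[0]]  # next known value carried backward
    return filled


# Made-up series for five consecutive years; the first, third and fifth values are missing.
series = [None, 100.0, None, 120.0, None]
print(fill_gaps(series))
```

&lt;p>Carrying a value to the edge of a series is the crudest of these options, which is why a fitted time series model is usually preferable for backcasting and forecasting when enough observations are available.&lt;/p>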
&lt;p>If your organization is working with panel (longitudinal multiple) regressions or various machine learning applications, then your team knows that not having the +66.67% gain would be a deal-breaker in the choice of models and in the precision of estimates, KPIs or other quantitative products. They also know that they would spend about 90% of their data resources on achieving this +66.67% gain in usability.&lt;/p>
&lt;p>If you happen to work in an NGO, a business unit or a research institute that does not employ data scientists, then it is likely that you can never achieve this improvement, and you have to give up on a number of quantitative tools or visualizations. If you have a data scientist onboard, that professional can use our work as a starting point.&lt;/p>
&lt;h2 id="can-you-trust-our-data">Can you trust our data?&lt;/h2>
&lt;p>We believe that you can trust our data more than the original public source. We use statistical expertise to find out why data may be missing. Often, it is present in a wrong location (for example, because the name of a region changed).&lt;/p>
&lt;p>If you are reluctant to use estimates, think about discarding known actual data from your forecast or visualization, because one data point is missing. How do you provide more accurate information? By hiding known actual data, because one point is missing, or by using all known data and an estimate?&lt;/p>
&lt;p>Our codebooks and our API use the &lt;a href="https://sdmx.org/?page_id=3215/" target="_blank" rel="noopener">Statistical Data and Metadata eXchange&lt;/a> documentation standards to clearly indicate which data is observed, which is missing, which is estimated, and of course, also how it is estimated.
This highlights another important aspect of data trustworthiness: if you have a better idea, you can replace our estimates with your own.&lt;/p>
&lt;p>Our indicators come with standardized codebooks that contain not only descriptive metadata, but also administrative metadata about the history of the indicator values. You will find very important information about the statistical method we used to fill in the data gaps, and even a link to the reliable, peer-reviewed scientific statistical software that made the calculations. For data scientists, we record plenty of information about the computing environment, too; this can come in handy if your estimates need external authentication, or if you suspect a bug.&lt;/p>
&lt;h2 id="avoid-the-data-sisyphus">Avoid the data Sisyphus&lt;/h2>
&lt;p>If you work in an academic institution, in an NGO or a consultancy, you can never be sure who downloaded the &lt;a href="https://appsso.eurostat.ec.europa.eu/nui/show.do?dataset=sbs_na_1a_se_r2&amp;amp;lang=en" target="_blank" rel="noopener">Annual detailed enterprise statistics for services (NACE Rev. 2 H-N and S95)&lt;/a> folder from Eurostat. Did they modify the dataset? Did they already make corrections for the missing data? What method did they use? To prevent many potential problems, you will likely download it again, and again, and again&amp;hellip;&lt;/p>
&lt;td style="text-align: center;">
&lt;figure id="figure-see-our-the-data-sisyphushttpsreprexnlpost2021-07-08-data-sisyphus-blogpost">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="See our [The Data Sisyphus](https://reprex.nl/post/2021-07-08-data-sisyphus/) blogpost." srcset="
/media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_cd48a6c374c9ff68a08abe79a6abf2f4.webp 400w,
/media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_a6eb1b13ff33a5c73aba34550964ff52.webp 760w,
/media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_cd48a6c374c9ff68a08abe79a6abf2f4.webp"
width="760"
height="507"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
See our &lt;a href="https://reprex.nl/post/2021-07-08-data-sisyphus/" target="_blank" rel="noopener">The Data Sisyphus&lt;/a> blogpost.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>We have a better solution. You can always rely on our API to directly import the latest, best data, but if you want to be sure, you can use our &lt;a href="https://zenodo.org/record/5652118#.YYhGOGDMLIU" target="_blank" rel="noopener">regular backups&lt;/a> on Zenodo. Zenodo is an open science repository managed by CERN and supported by the European Union. On Zenodo, you can find an authoritative copy of our indicator (and its previous versions) with a digital object identifier, in this case, &lt;a href="https://doi.org/10.5281/zenodo.5652118" target="_blank" rel="noopener">10.5281/zenodo.5652118&lt;/a>. These datasets will be preserved for decades, and nobody can manipulate them. You cannot accidentally overwrite them, and we have no backdoor access to modify them.&lt;/p>
&lt;h2 id="get-the-data">Get the data&lt;/h2>
&lt;p>&lt;a href="https://doi.org/10.5281/zenodo.5652118" target="_blank" rel="noopener">
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://zenodo.org/badge/DOI/10.5281/zenodo.5652118.svg" alt="DOI" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/a>&lt;/p>
&lt;h2 id="how-can-we-do-better">How can we do better?&lt;/h2>
&lt;details class="spoiler " id="spoiler-4">
&lt;summary>Are you a data user?&lt;/summary>
&lt;p>&lt;em>Shall we do some further automatic data enhancements with our datasets? Document with different metadata? Link more information for business, policy, or academic use? Please get in touch with &lt;a href="https://reprex.nl/#contact" target="_blank" rel="noopener">us&lt;/a>!&lt;/em>&lt;/p>
&lt;/details></description></item><item><title>Reprex Joins RECREO Research Consortium To Develop Innovation Indicators</title><link>https://reprex-next.netlify.app/post/2021-10-06-recreo/</link><pubDate>Sat, 06 Nov 2021 16:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-10-06-recreo/</guid><description>&lt;p>The &lt;a href="https://www.santannapisa.it/it" target="_blank" rel="noopener">Scuola Superiore di Studi Universitari e di Perfezionamento Sant’Anna&lt;/a> and &lt;a href="https://www.unitn.it/en" target="_blank" rel="noopener">Università degli Studi di Trento&lt;/a> (Italy); &lt;a href="https://www.create.ac.uk/" target="_blank" rel="noopener">University of Glasgow&lt;/a> (United Kingdom); &lt;a href="https://www.ivir.nl/" target="_blank" rel="noopener">Universiteit van Amsterdam&lt;/a> and &lt;a href="https://pro.europeana.eu/" target="_blank" rel="noopener">Stichting Europeana&lt;/a> from the Netherlands; the &lt;a href="https://www.maynoothuniversity.ie/" target="_blank" rel="noopener">National University of Ireland Maynooth&lt;/a> (Ireland); &lt;a href="https://www.ut.ee/en/" target="_blank" rel="noopener">Tartu Ulikool&lt;/a> (Estonia); &lt;a href="https://u-szeged.hu/" target="_blank" rel="noopener">Szegedi Tudományegyetem&lt;/a> (Hungary); &lt;a href="https://www.santamarialareal.org/" target="_blank" rel="noopener">Fundacion Santa Maria La Real del Patrimonio Historico&lt;/a> from Spain; the &lt;a href="https://www.kuleuven.be/kuleuven/" target="_blank" rel="noopener">Katholieke Universiteit Leuven&lt;/a> (Belgium); &lt;a href="https://cultureactioneurope.org/" target="_blank" rel="noopener">Culture Action Europe AISBL&lt;/a> and &lt;a href="https://www.ideaconsult.be/en/" target="_blank" rel="noopener">IDEA Strategische Economische Consulting&lt;/a> (Belgium) and &lt;a href="https://reprex.nl/" target="_blank" rel="noopener">Reprex&lt;/a> created the &lt;code>REshaping CCSI REsearch: Open data, policy analysis and methods for 
evidence-based decision-making consortium&lt;/code>, which will mainly develop new policy evidence in the field of innovation and inclusiveness for the cultural and creative sectors and industries. The consortium is applying for a Horizon Europe grant under the &lt;code>HORIZON-CL2-2021-HERITAGE-01-03&lt;/code> &lt;a href="https://ec.europa.eu/info/funding-tenders/opportunities/portal/screen/opportunities/topic-details/horizon-cl2-2021-heritage-01-03" target="_blank" rel="noopener">Cultural and creative industries as a driver of innovation and competitiveness&lt;/a> call of the European Commission.&lt;/p>
&lt;p>Policymakers face challenges when trying to implement a strict evidence-based approach to decision-making in the field of cultural and creative sectors and industries (CCSI). This is mostly due to four phenomena:&lt;/p>
&lt;ol>
&lt;li>&lt;code>Evidence dissonances in mapping, measuring and analysis of key indicators&lt;/code>, which lead to improper generalizations and gaps in decision-makers’ knowledge and stakeholders’ awareness;&lt;/li>
&lt;li>&lt;code>Fragmentation of hubs of production and concentration of platforms&lt;/code>, which create statistical biases and have features that hardly fit with traditional impact assessment methods;&lt;/li>
&lt;li>&lt;code>Datafication&lt;/code>, which is revolutionizing CCSI but remains difficult to investigate, thus broadening knowledge gaps; and&lt;/li>
&lt;li>&lt;code>Stakeholders’ fragmentation and conflicting interests&lt;/code>, which hinder their engagement, awareness-raising and uptake of policy inputs.&lt;/li>
&lt;/ol>
&lt;p>With its cross-disciplinary consortium of academics, practitioners and a strong network of stakeholders, engaged via participatory research strategies, RECREO will help policymakers and stakeholders tackle such challenges by generating new knowledge and methods to fill in knowledge and awareness gaps. RECREO will achieve this goal through four actions.&lt;/p>
&lt;p>First, it will generate a wide array of horizontal and sector-specific datasets, made openly accessible via the &lt;a href="https://reprex.nl/project/ccsi-data-observatory/" target="_blank" rel="noopener">CCSI Data Observatory&lt;/a> and the Evidence Synthesis Platform. Second, it will offer an unprecedented EU and comparative mapping and impact assessment of key regulatory and policy measures relevant for CCSI, made available on the Law and Policy Observatory. Third, it will develop innovative methods to measure and assess CCSI innovation, competitiveness and spill-over effects, emphasizing inclusiveness, diversity and sustainability. Last, it will offer policy recommendations and best practices aimed at supporting the sustainable growth and competitiveness of culturally diverse CCSI, and their cross-fertilization with cultural heritage promotion and preservation.&lt;/p></description></item><item><title>Reprex on MaMA</title><link>https://reprex-next.netlify.app/post/2021-10-15-mama/</link><pubDate>Fri, 15 Oct 2021 19:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-10-15-mama/</guid><description>&lt;p>Reprex’s co-founder and the main developer of the Digital Music Observatory, &lt;a href="https://reprex.nl/authors/daniel_antal/" target="_blank" rel="noopener">Daniel Antal&lt;/a>, and the Digital Music Observatory curator, &lt;a href="https://music.dataobservatory.eu/author/marie-zhorova/" target="_blank" rel="noopener">Marie Zhorová&lt;/a>, participated in the MaMA Festival &amp;amp; Convention in Paris on 13-15 October within the &lt;a href="https://www.jumpmusic.eu/fellows/" target="_blank" rel="noopener">JUMP Music Market Accelerator Program&lt;/a>. We introduced our Digital Music Observatory to national music organizations and encouraged them to try out a cooperation with us. (See Use Cases below.)&lt;/p>
&lt;p>Our main aim was to find new users for our Digital Music Observatory, and to find partners for a future Horizon Europe R&amp;amp;D project to develop the scientific pillars of the Observatory in a manner that meets practical industry needs and the feature requirements laid out in the Feasibility Study for a European Music Observatory.&lt;/p>
&lt;p>Our concept was introduced in Le Trianon to a wider audience during the &lt;a href="https://reprex.nl/talk/digital-music-observatory-on-the-mama-convention-2021/" target="_blank" rel="noopener">JUMP Music Market Accelerator Pitch Session&lt;/a> and in one-to-one meetings to representatives of French national organizations. We have also started to investigate the possibility of cooperating with two startups to bring our data services closer to artists, labels, and publishers.&lt;/p>
&lt;script class="lesondier-widget" data-ls-event-id="12386" data-ls-site-slug="mama" src="https://live.mamafestival.com/build/widget/widget_loader.min.js" data-ls-width="600px" data-ls-height="435px" async>&lt;/script>
&lt;h2 id="use-cases">Use Cases&lt;/h2>
&lt;h3 id="fair-streaming">Fair Streaming&lt;/h3>
&lt;td style="text-align: center;">
&lt;figure id="figure-daniel-introduced-our-work-made-for-the-uk-ipos-music-creators-earnings-in-the-digital-era-projecthttpsmusicdataobservatoryeupublicationmce_empirical_streaming_2021-about-the-justified-and-not-justified-differences-among-music-rightsholders-earnings-and-the-diminishing-market-value-of-streams-we-believe-that-our-uk-approach-is-a-particularly-interesting-addition-to-join-with-the-distribution-analysishttpsdataandlyricscompost2021-02-21-cnm-streaming-performed-by-the-centre-nationale-de-la-musiquehttpscnmfren-and-deloitte-in-france">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Daniel introduced our work for the UK IPO&amp;#39;s [Music Creators&amp;#39; Earnings in the Digital Era Project](https://music.dataobservatory.eu/publication/mce_empirical_streaming_2021/) on the justified and unjustified differences among music rightsholders&amp;#39; earnings and the diminishing market value of streams. We believe that our UK approach is a particularly interesting complement to [the distribution analysis](https://dataandlyrics.com/post/2021-02-21-cnm-streaming/) performed by the [Centre national de la musique](https://cnm.fr/en/) and Deloitte in France." srcset="
/media/img/reports/mce/featured_hue50608b3f52d6750042187fb48482821_711194_82c4b196d92bab0818d4a4e473f93c67.webp 400w,
/media/img/reports/mce/featured_hue50608b3f52d6750042187fb48482821_711194_9755d01f37d217b33040e55332846897.webp 760w,
/media/img/reports/mce/featured_hue50608b3f52d6750042187fb48482821_711194_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/reports/mce/featured_hue50608b3f52d6750042187fb48482821_711194_82c4b196d92bab0818d4a4e473f93c67.webp"
width="760"
height="526"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Daniel introduced our work for the UK IPO&amp;rsquo;s &lt;a href="https://music.dataobservatory.eu/publication/mce_empirical_streaming_2021/" target="_blank" rel="noopener">Music Creators&amp;rsquo; Earnings in the Digital Era Project&lt;/a> on the justified and unjustified differences among music rightsholders&amp;rsquo; earnings and the diminishing market value of streams. We believe that our UK approach is a particularly interesting complement to &lt;a href="https://dataandlyrics.com/post/2021-02-21-cnm-streaming/" target="_blank" rel="noopener">the distribution analysis&lt;/a> performed by the &lt;a href="https://cnm.fr/en/" target="_blank" rel="noopener">Centre national de la musique&lt;/a> and Deloitte in France.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;h3 id="fair-value">Fair Value&lt;/h3>
&lt;td style="text-align: center;">
&lt;figure id="figure-daniel-introduced-to-collective-management-professioanls-our-innovative-approach-for-private-copying-valuation-royalty-price-setting-estimating-the-values-of-value-transfer-to-media-platforms-and-other-topics-of-interests-for-collective-management-and-rights-management-organizations-our-approach-has-a-proven-track-record-to-increase-revenues-for-creators">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Daniel introduced collective management professionals to our innovative approach to private copying valuation, royalty price setting, estimating the value transferred to media platforms, and other topics of interest for collective management and rights management organizations. Our approach has a proven track record of increasing revenues for creators." srcset="
/media/img/reports/mce/listen_fair_treemap_en_hu541e296fb926ca7602368e34164d282a_326052_da197cb6040c006765b23dec7d5086e1.webp 400w,
/media/img/reports/mce/listen_fair_treemap_en_hu541e296fb926ca7602368e34164d282a_326052_d0d3fea66be53635f73bc1425195a568.webp 760w,
/media/img/reports/mce/listen_fair_treemap_en_hu541e296fb926ca7602368e34164d282a_326052_1200x1200_fit_q75_h2_lanczos.webp 1200w"
src="https://reprex-next.netlify.app/media/img/reports/mce/listen_fair_treemap_en_hu541e296fb926ca7602368e34164d282a_326052_da197cb6040c006765b23dec7d5086e1.webp"
width="760"
height="608"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Daniel introduced collective management professionals to our innovative approach to private copying valuation, royalty price setting, estimating the value transferred to media platforms, and other topics of interest for collective management and rights management organizations. Our approach has a proven track record of increasing revenues for creators.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;h3 id="open-music-observatory">Open Music Observatory&lt;/h3>
&lt;td style="text-align: center;">
&lt;figure id="figure-we-introduced-our-approach-to-building-the-european-music-observatoryhttpsmusicdataobservatoryeupost2021-03-04-jump-2021-in-a-decentralized-way-relying-not-only-on-the-resources-of-creative-europe-but-also-on-open-science-horizon-europe-bringing-the-music-industry-music-research-in-universities-and-cultural-policy-under-one-open-collaboration-because-france-is-building-its-own-music-observatory-of-a-kind-the-decentralized-approach-could-particularly-benefit-french-stakeholders">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="We introduced our approach to building the [European Music Observatory](https://music.dataobservatory.eu/post/2021-03-04-jump-2021/) in a decentralized way, relying not only on the resources of Creative Europe but also on Open Science, Horizon Europe, bringing the music industry, music research in universities and cultural policy under one open collaboration. Because France is building its own music observatory of a kind, the decentralized approach could particularly benefit French stakeholders." srcset="
/media/img/observatory_screenshots/dmo_contributors_hua4f41ef7327b64bb97f169af135070bd_140729_a07a8e618fa7317f6f8256b9a334262e.webp 400w,
/media/img/observatory_screenshots/dmo_contributors_hua4f41ef7327b64bb97f169af135070bd_140729_3a4ae7f72478fd880961b08e1f7075dd.webp 760w,
/media/img/observatory_screenshots/dmo_contributors_hua4f41ef7327b64bb97f169af135070bd_140729_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/observatory_screenshots/dmo_contributors_hua4f41ef7327b64bb97f169af135070bd_140729_a07a8e618fa7317f6f8256b9a334262e.webp"
width="760"
height="427"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
We introduced our approach to building the &lt;a href="https://music.dataobservatory.eu/post/2021-03-04-jump-2021/" target="_blank" rel="noopener">European Music Observatory&lt;/a> in a decentralized way, relying not only on the resources of Creative Europe but also on Open Science, Horizon Europe, bringing the music industry, music research in universities and cultural policy under one open collaboration. Because France is building its own music observatory of a kind, the decentralized approach could particularly benefit French stakeholders.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;h3 id="listen-local">Listen Local&lt;/h3>
&lt;td style="text-align: center;">
&lt;figure id="figure-marie-and-daniel-introduced-the-listen-local-projecthttpsreprexnlprojectlisten-local-to-startups-our-listen-local-project-analyzes-why-recommendation-engines-do-not-recommend-locally-relevant-music-such-as-music-from-paris-in-paris-slovakian-music-for-slovaks-and-offers-alternative-approaches-and-fixes--we-were-discussing-with-other-startups-serving-artists-and-small-labels-to-bring-down-our-macro-level-approaches-benefits-to-the-level-of-aritsts-as-we-did-in-our-experimental-project-in-slovakiahttpsmusicdataobservatoryeupublicationlisten_local_2020-supported-by-our-scientific-research-cooperation-see-our-pre-print-manuscripthttpsmusicdataobservatoryeupublicationeuropean_visibilitiy_2021">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Marie and Daniel introduced the [Listen Local project](https://reprex.nl/project/listen-local/) to startups. Our Listen Local project analyzes why recommendation engines do not recommend locally relevant music (such as music from Paris in Paris, Slovakian music for Slovaks) and offers alternative approaches and fixes. We were discussing with other startups serving artists and small labels to bring down our macro-level approaches&amp;#39; benefits to the level of aritsts, as we did in our experimental project in [Slovakia](https://music.dataobservatory.eu/publication/listen_local_2020/) supported by our scientific research cooperation (see our pre-print [manuscript](https://music.dataobservatory.eu/publication/european_visibilitiy_2021/).)" srcset="
/media/img/reports/listen_local_2020/listen_local_study_covers_hue3bbdd36723034473d5308625670dcc8_550932_763efe9ed9c592524de77426709a3c4d.webp 400w,
/media/img/reports/listen_local_2020/listen_local_study_covers_hue3bbdd36723034473d5308625670dcc8_550932_03e3303a405b457472b9b2ff5bc4c0d4.webp 760w,
/media/img/reports/listen_local_2020/listen_local_study_covers_hue3bbdd36723034473d5308625670dcc8_550932_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/reports/listen_local_2020/listen_local_study_covers_hue3bbdd36723034473d5308625670dcc8_550932_763efe9ed9c592524de77426709a3c4d.webp"
width="760"
height="507"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Marie and Daniel introduced the &lt;a href="https://reprex.nl/project/listen-local/" target="_blank" rel="noopener">Listen Local project&lt;/a> to startups. Our Listen Local project analyzes why recommendation engines do not recommend locally relevant music (such as music from Paris in Paris, Slovakian music for Slovaks) and offers alternative approaches and fixes. We discussed with other startups serving artists and small labels how to bring the benefits of our macro-level approach down to the level of artists, as we did in our experimental project in &lt;a href="https://music.dataobservatory.eu/publication/listen_local_2020/" target="_blank" rel="noopener">Slovakia&lt;/a> supported by our scientific research cooperation (see our pre-print &lt;a href="https://music.dataobservatory.eu/publication/european_visibilitiy_2021/" target="_blank" rel="noopener">manuscript&lt;/a>.)
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;h2 id="why-data-observatory">Why Data Observatory?&lt;/h2>
&lt;p>Our use cases highlight the value of having a wide range of data available for industry players, researchers and policy-makers. In the era of big data, and when open data is becoming &lt;em>legally&lt;/em> more and more available, it is important to have one place with a single data collection method. Copernicus built a permanent observatory for the ongoing observation of celestial bodies. We built an automated data observatory to permanently collect data about music.&lt;/p></description></item><item><title>CCSI Data Observatory</title><link>https://reprex-next.netlify.app/post/2021-10-05-ccsi/</link><pubDate>Wed, 06 Oct 2021 16:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-10-05-ccsi/</guid><description>&lt;p>The creative and cultural sectors and industries are mainly made up of networks of freelancers and microenterprises, with very few medium-sized companies. Their economic performance, problems, and innovation capacities are hidden. Our open collaboration to create this data observatory is committed to changing this. Relying on modern data science, the re-use of open governmental data, open science data, and novel harmonized data collection, we aim to fill in the gaps left in the official statistics of the European Union.&lt;/p>
&lt;p>We believe that introducing Open Policy Analysis standards with open data, open-source software and research automation can help us better understand how creative people and their enterprises and institutions add value to the European economy, how they create jobs, innovate, and increase the well-being of a diverse European society. Our collaboration is open to individuals and citizen scientists.&lt;/p>
&lt;p>The new observatory can be reached at &lt;a href="https://ccsi.dataobservatory.eu/" target="_blank" rel="noopener">ccsi.dataobservatory.eu&lt;/a> and will be institutionally hosted by &lt;a href="https://www.ivir.nl/" target="_blank" rel="noopener">IViR&lt;/a>, the &lt;em>Institute for Information Law&lt;/em> of the University of Amsterdam, where Reprex’s co-founder, Daniel Antal, will coordinate the development of this new, open scientific tool. Reprex will continue to develop the working model of the data observatory and continue to build open source software tools within the &lt;a href="http://ropengov.org/" target="_blank" rel="noopener">rOpenGov community&lt;/a> and the &lt;a href="https://ropengov.r-universe.dev/" target="_blank" rel="noopener">R-Universe&lt;/a> initiative of rOpenSci.&lt;/p>
&lt;p>The &lt;a href="https://www.santannapisa.it/it" target="_blank" rel="noopener">Scuola Superiore di Studi Universitari e di Perfezionamento Sant’Anna&lt;/a> and &lt;a href="https://www.unitn.it/en" target="_blank" rel="noopener">Università degli Studi di Trento&lt;/a> (Italy); &lt;a href="https://www.create.ac.uk/" target="_blank" rel="noopener">University of Glasgow&lt;/a> (United Kingdom); &lt;a href="https://www.ivir.nl/" target="_blank" rel="noopener">Universiteit van Amsterdam&lt;/a> and &lt;a href="https://pro.europeana.eu/" target="_blank" rel="noopener">Stichting Europeana&lt;/a> from the Netherlands; the &lt;a href="https://www.maynoothuniversity.ie/" target="_blank" rel="noopener">National University of Ireland Maynooth&lt;/a> (Ireland); &lt;a href="https://www.ut.ee/en/" target="_blank" rel="noopener">Tartu Ulikool&lt;/a> (Estonia); &lt;a href="https://u-szeged.hu/" target="_blank" rel="noopener">Szegedi Tudományegyetem&lt;/a> (Hungary); &lt;a href="https://www.santamarialareal.org/" target="_blank" rel="noopener">Fundacion Santa Maria La Real del Patrimonio Historico&lt;/a> from Spain; the &lt;a href="https://www.kuleuven.be/kuleuven/" target="_blank" rel="noopener">Katholieke Universiteit Leuven&lt;/a>, (Belgium); &lt;a href="https://cultureactioneurope.org/" target="_blank" rel="noopener">Culture Action Europe AISBL&lt;/a> and &lt;a href="https://www.ideaconsult.be/en/" target="_blank" rel="noopener">IDEA Strategische Economische Consulting&lt;/a> (Belgium) and Reprex created the the &lt;code>RECREO&lt;/code> consortium, which will mainly develop new policy evidence in the field of innovation and inclusiveness for the creative and cultural sectors, industries. 
The Consortium applies for a Horizon Europe grant under the &lt;code>HORIZON-CL2-2021-HERITAGE-01-03&lt;/code> &lt;a href="https://ec.europa.eu/info/funding-tenders/opportunities/portal/screen/opportunities/topic-details/horizon-cl2-2021-heritage-01-03" target="_blank" rel="noopener">Cultural and creative industries as a driver of innovation and competitiveness&lt;/a> call of the European Commission.&lt;/p></description></item><item><title>Research &amp; Analysis: Music Creators’ Earnings in the Digital Era</title><link>https://reprex-next.netlify.app/post/2021-09-23-mce_reports/</link><pubDate>Thu, 23 Sep 2021 08:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-09-23-mce_reports/</guid><description>&lt;p>Reprex with its &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital Music Observatory team&lt;/a> was commissioned to prepare an analysis of the justified and unjustified differences in music creators’ earnings. We have posted our most important findings in an earlier blog post (&lt;a href="https://music.dataobservatory.eu/post/2021-06-18-mce/" target="_blank" rel="noopener">Music Creators’ Earnings in the Streaming Era. United Kingdom Research Cooperation With the Digital Music Observatory&lt;/a>).&lt;/p>
&lt;p>The UK Intellectual Property Office has published the entire report on the music creators’ earnings, and we have made our detailed analysis available in a side-publication. Reprex also signed an agreement with the researchers of the Music Creators’ Earnings project to deposit all data published in the report in the Digital Music Observatory, and to promote the building of the observatory further.&lt;/p>
&lt;p>The research questions asked in this report are related to the &lt;a href="https://www.gov.uk/government/publications/music-creators-earnings-in-the-digital-era" target="_blank" rel="noopener">Music Creators&amp;rsquo; Earnings Project&lt;/a> (MCE), exploring issues concerning equitable remuneration and earnings distributions. We were tasked with providing a longitudinal analysis of earnings development and relating our findings to equitable remuneration. Our work started from a very broadly defined problem: how much money music creators (rightsholders) earn from streaming, how these earnings are distributed, and how the earnings and their distribution have developed during the last decade.&lt;/p>
&lt;p>The highly globalized music industry generates two important international reports, as well as several national reports, but these are not suitable for the analysis of the typical or average rightsholder, nor for small labels and publishers who do not represent a large and internationally diversified portfolio of music works or recordings. Copyright and neighboring right revenues are collected in national jurisdictions. Because British artists are almost never constrained by their use of language, and the UK Music Industry is highly competitive in the global music markets, even relatively less known rightsholders earn revenues from dozens of national markets. The lack of market information on music sales volumes and prices for each jurisdiction, together with the unaccounted-for national, domestic, and foreign revenues, makes the analysis of rightsholders’ earnings, or the economics of a certain distribution channel like music streaming or media platforms, impossible.&lt;/p>
&lt;figure id="figure-the-effect-of-international-diversification-on-revenues---a-combination-of-international-price-differences-and-exchange-rate-fluctuations">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/media/img/reports/mce/Effect_International_Diversification_Revenues_Coplot.png" alt="The Effect of International Diversification on Revenues - a combination of international price differences and exchange rate fluctuations." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
The Effect of International Diversification on Revenues - a combination of international price differences and exchange rate fluctuations.
&lt;/figcaption>&lt;/figure>
&lt;p>While total earnings are reported by international and national organizations, they hide five important economic variables: changes in sales volumes, changes in prices, market share in various national jurisdictions (which have their own volume and price movements), the exchange rates applied, and the share of the repertoire exploited. Even worse, the global music industry has no comprehensive database of rightsholders, music works, and recordings – this is the data gap that we would like to fill with the Digital Music Observatory.&lt;/p>
&lt;p>Our &lt;a href="https://mce.dataobservatory.eu/" target="_blank" rel="noopener">report&lt;/a> highlights some important lessons. First, we show that in the era of global music sales platforms it is impossible to understand the economics of music streaming without international data harmonization and advanced surveying and sampling. Paradoxically, without careful adjustments for accruals, market shares in jurisdictions, and disaggregation of price and volume changes, the British industry cannot analyze its own economics because of its high level of integration to the global music economy. Furthermore, the replacement of former public performances, mechanical licensing, and private copying remunerations (which has been available for British rightsholders in their European markets for decades) with less valuable streaming licenses has left many rightsholders poorer. Making adjustments on the distribution system without modifying the definition of equitable remuneration rights or the pro-rata distribution scheme of streaming platforms opens up many conflicts while solving not enough fundamental problems. Therefore, we suggest participation in international data harmonization and policy coordination to help regain the historical value of music.&lt;/p>
&lt;h2 id="context">Context&lt;/h2>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/media/img/blogposts_20121/dcms_economics_music_streaming.png" alt="" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;p>The idea of our Digital Music Observatory was brought to the UK policy debate on music streaming by the &lt;em>Written evidence submitted by The state51 Music Group&lt;/em> to the &lt;em>Economics of music streaming review&lt;/em> of the UK Parliament&amp;rsquo;s DCMS Committee&lt;sup id="fnref:1">&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref">1&lt;/a>&lt;/sup>.&lt;/p>
&lt;p>The music industry requires a permanent market monitoring facility to win fights in competition tribunals, because it is increasingly disputing revenues with the world’s biggest data owners. This was precisely the role of the former CEEMID&lt;sup id="fnref:2">&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref">2&lt;/a>&lt;/sup> program, which was initiated by a group of collective management societies. Starting with three relatively data-poor countries, where data pooling allowed rightsholders to increase revenues, the CEEMID data collection program was extended in 2019 to 12 countries. The &lt;a href="https://ceereport2020.ceemid.eu/" target="_blank" rel="noopener">final regional report&lt;/a>, after the release of the detailed &lt;a href="https://music.dataobservatory.eu/publication/hungary_music_industry_2014/" target="_blank" rel="noopener">Hungarian&lt;/a>, &lt;a href="https://music.dataobservatory.eu/publication/slovak_music_industry_2019/" target="_blank" rel="noopener">Slovak&lt;/a> and &lt;a href="https://music.dataobservatory.eu/publication/private_copying_croatia_2019/" target="_blank" rel="noopener">Croatian reports&lt;/a> of CEEMID, was sponsored by Consolidated Independent (of the &lt;em>state51 music group&lt;/em>.)&lt;/p>
&lt;p>CEEMID was eventually transformed into the &lt;em>Demo Music Observatory&lt;/em> in 2020&lt;sup id="fnref:3">&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref">3&lt;/a>&lt;/sup>, following the planned structure of the &lt;a href="https://dataandlyrics.com/post/2020-11-16-european-music-observatory-feasibility/" target="_blank" rel="noopener">European Music Observatory&lt;/a>, and validated in the world&amp;rsquo;s 2nd ranked university-backed incubator, the Yes!Delft AI+Blockchain Validation Lab. In 2021, under the final name &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital Music Observatory&lt;/a>, it became open to any rightsholder or stakeholder organization or music research institute, and it is being launched with the help of the &lt;a href="https://dataandlyrics.com/post/2021-03-04-jump-2021/" target="_blank" rel="noopener">JUMP European Music Market Accelerator Programme&lt;/a>, which is co-funded by the Creative Europe Programme of the European Union.&lt;/p>
&lt;p>In December 2020, we started investigating how the music observatory concept could be introduced in the UK, and how our data and analytical skills could be used in the &lt;a href="https://digit-research.org/research/related-projects/music-creators-earnings-in-the-streaming-era/" target="_blank" rel="noopener">Music Creators’ Earnings in the Streaming Era&lt;/a> (in short: MCE) project, which is taking place in parallel with the heated political debates around the DCMS inquiry. After the &lt;em>state51 music group&lt;/em> gave permission for the UK Intellectual Property Office to reuse the data that was originally published as the experimental &lt;a href="https://ceereport2020.ceemid.eu/market.html#recmarket" target="_blank" rel="noopener">CEEMID-CI Streaming Volume and Revenue Indexes&lt;/a>, we reached a cooperation agreement between the MCE Project and the &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital Music Observatory&lt;/a>. We provided a detailed historical analysis and computer simulation for the MCE Project, and we will host all the data of the &lt;em>Music Creators’ Earnings Report&lt;/em> in our observatory, hopefully no later than early July 2021.&lt;/p>
&lt;figure id="figure-the-digital-music-observatoryhttpsmusicdataobservatoryeu-contributes-to-the-music-creators-earnings-in-the-streaming-era-project-with-understanding-the-level-of-justified-and-unjustified-differences-in-rightsholder-earnings-and-putting-them-into-a-broader-music-economy-context">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/media/img/observatory_screenshots/dmo_opening_screen.png" alt="The [Digital Music Observatory](https://music.dataobservatory.eu/) contributes to the Music Creators’ Earnings in the Streaming Era project with understanding the level of justified and unjustified differences in rightsholder earnings, and putting them into a broader music economy context." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
The &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital Music Observatory&lt;/a> contributes to the Music Creators’ Earnings in the Streaming Era project with understanding the level of justified and unjustified differences in rightsholder earnings, and putting them into a broader music economy context.
&lt;/figcaption>&lt;/figure>
&lt;p>We started our cooperation with the two principal investigators of the project, &lt;a href="https://music.dataobservatory.eu/author/prof-david-hesmondhalgh/" target="_blank" rel="noopener">Prof David Hesmondhalgh&lt;/a> and &lt;a href="https://music.dataobservatory.eu/author/hyojung-sun/" target="_blank" rel="noopener">Dr Hyojung Sun&lt;/a>, back in April and will start releasing the findings and the data in July 2021.&lt;/p>
&lt;h2 id="join-us">Join us&lt;/h2>
&lt;p>&lt;em>Do you need high-quality data for your music business or institution? Are you a music researcher? Join our open collaboration Digital Music Observatory team as a &lt;a href="https://reprex-next.netlify.app/authors/curator">data curator&lt;/a>, &lt;a href="https://reprex-next.netlify.app/authors/developer">developer&lt;/a> or &lt;a href="https://reprex-next.netlify.app/authors/team">business developer&lt;/a>.&lt;/em>&lt;/p>
&lt;h2 id="footnote-references">Footnote References&lt;/h2>
&lt;section class="footnotes" role="doc-endnotes">
&lt;hr>
&lt;ol>
&lt;li id="fn:1" role="doc-endnote">
&lt;p>state51 Music Group. 2020. “Written Evidence Submitted by The state51 Music Group. Economics of Music Streaming Review. Response to Call for Evidence.” UK Parliament website. &lt;a href="https://committees.parliament.uk/writtenevidence/15422/html/" target="_blank" rel="noopener">https://committees.parliament.uk/writtenevidence/15422/html/&lt;/a>.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;li id="fn:2" role="doc-endnote">
&lt;p>Artisjus, HDS, SOZA, and Candole Partners. 2014. “Measuring and Reporting Regional Economic Value Added, National Income and Employment by the Music Industry in a Creative Industries Perspective. Memorandum of Understanding to Create a Regional Music Database to Support Professional National Reporting, Economic Valuation and a Regional Music Study.”&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;li id="fn:3" role="doc-endnote">
&lt;p>Antal, Daniel. 2021. “Launching Our Demo Music Observatory.” &lt;em>Data &amp;amp; Lyrics&lt;/em>. Reprex. &lt;a href="https://dataandlyrics.com/post/2020-09-15-music-observatory-launch/" target="_blank" rel="noopener">https://dataandlyrics.com/post/2020-09-15-music-observatory-launch/&lt;/a>.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;/ol>
&lt;/section></description></item><item><title>The Data Sisyphus</title><link>https://reprex-next.netlify.app/post/2021-07-08-data-sisyphus/</link><pubDate>Thu, 08 Jul 2021 09:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-07-08-data-sisyphus/</guid><description>&lt;td style="text-align: center;">
&lt;figure id="figure-sisyphus-was-punished-by-being-forced-to-roll-an-immense-boulder-up-a-hill-only-for-it-to-roll-down-every-time-it-neared-the-top-repeating-this-action-for-eternity--this-is-the-price-that-project-managers-and-analysts-pay-for-the-inadequate-documentation-of-their-data-assets">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Sisyphus was punished by being forced to roll an immense boulder up a hill only for it to roll down every time it neared the top, repeating this action for eternity. This is the price that project managers and analysts pay for the inadequate documentation of their data assets." srcset="
/media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_cd48a6c374c9ff68a08abe79a6abf2f4.webp 400w,
/media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_a6eb1b13ff33a5c73aba34550964ff52.webp 760w,
/media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_cd48a6c374c9ff68a08abe79a6abf2f4.webp"
width="760"
height="507"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
Sisyphus was punished by being forced to roll an immense boulder up a hill only for it to roll down every time it neared the top, repeating this action for eternity. This is the price that project managers and analysts pay for the inadequate documentation of their data assets.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>&lt;em>When was a file downloaded from the internet? What has happened to it since? Are there updates? Was a bibliographical reference made for quotations? Were missing values imputed? Were currencies translated? Who knows about it – who created a dataset, who contributed to it? Which spreadsheet file is an intermediate version, and which is the final one, checked and approved by a senior manager?&lt;/em>&lt;/p>
&lt;p>Big data creates inequality and injustice. One aspect of this inequality is the cost of data processing and documentation – a greatly underestimated and usually unreported cost item. In small organizations, where there are no separate data science and data engineering roles, data is usually supposed to be processed and documented by (junior) analysts or researchers. This is a very important source of the gap between Big Tech and them: the data usually ends up very expensive, ill-formatted, and not readable by computers that use machine learning and AI. Usually the documentation steps are completely omitted.&lt;/p>
&lt;blockquote>
&lt;p>“Data is potential information, analogous to potential energy: work is required to release it.” &amp;ndash; Jeffrey Pomerantz&lt;/p>
&lt;/blockquote>
&lt;p>Metadata, which is information about the history of the data and about how it can be technically and legally reused, has a hidden cost. Cheap or low-quality external data comes with poor or no metadata, and small organizations lack the resources to add high-quality metadata to their datasets. However, this only perpetuates the problem.&lt;/p>
&lt;h2 id="metadata-unbillable-hours">The hidden cost item behind the unbillable hours&lt;/h2>
&lt;p>As we have shown with our research partners, such metadata problems are not unique to data analysis. Independent artists and small labels are suffering on music or book sales platforms, because their copyrighted content is not well documented. If you automatically document tens of thousands of songs or datasets, the documentation cost is very small per item. If you do it manually, the cost may be higher than the expected revenue from the song, or the total cost of the dataset itself. (See our research consortium&amp;rsquo;s preprint paper: &lt;a href="https://dataandlyrics.com/publication/european_visibilitiy_2021/" target="_blank" rel="noopener">Ensuring the Visibility and Accessibility of European Creative Content on the World Market: The Need for Copyright Data Improvement in the Light of New Technologies&lt;/a>)&lt;/p>
&lt;p>In the short run, small consultancies, NGOs, or, as a matter of fact, musicians seem to give up on high-quality documentation and logging for logical reasons. In the long run, this has two devastating consequences. First, computers, such as machine learning algorithms, cannot read their documents, data, and songs. Second, as memory fades, the ill-documented resources need to be re-created, re-checked, and reformatted. Often, they are even hard to find on your internal server or laptop archive.&lt;/p>
&lt;p>Metadata is a hidden destroyer of the competitiveness of corporate or academic research, or independent content management. It is never quoted on external data vendor invoices, and it is not planned as a cost item, because metadata, the description of a dataset, a document, a presentation, or a song, is meaningless without the resource that it describes. You never buy metadata. But if your dataset comes without proper metadata documentation, you are bound, like Sisyphus, to search for it, to re-arrange it, to check its currency units, its digits, its formatting. Data analysts are reported to spend about 80% of their working hours on data processing and not data analysis &amp;ndash; partly because data processing is a very laborious task that computers can do far more cheaply at scale, and partly because they do not know if the person who sat before them at the same desk has already performed these tasks, or if the person responsible for quality control checked for errors.&lt;/p>
&lt;td style="text-align: center;">
&lt;figure id="figure-uncut-diamonds-need-to-be-cut-polished-and-you-have-to-make-sure-that-they-come-from-a-legal-source-data-is-similar-it-needs-to-be-tidied-up-checked-and-documented-before-use-photo-dave-fischer">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Uncut diamonds need to be cut, polished, and you have to make sure that they come from a legal source. Data is similar: it needs to be tidied up, checked and documented before use. Photo: Dave Fischer." srcset="
/media/img/gems/Uncut-diamond_Edit_hu4573f19f53e1306ad88770fc5e491871_409761_0317c281e0aba727eb8e1a81805de459.webp 400w,
/media/img/gems/Uncut-diamond_Edit_hu4573f19f53e1306ad88770fc5e491871_409761_1470967ea871e5c3f6f247c839f6d52a.webp 760w,
/media/img/gems/Uncut-diamond_Edit_hu4573f19f53e1306ad88770fc5e491871_409761_1200x1200_fit_q75_h2_lanczos.webp 1200w"
src="https://reprex-next.netlify.app/media/img/gems/Uncut-diamond_Edit_hu4573f19f53e1306ad88770fc5e491871_409761_0317c281e0aba727eb8e1a81805de459.webp"
width="760"
height="506"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
Uncut diamonds need to be cut, polished, and you have to make sure that they come from a legal source. Data is similar: it needs to be tidied up, checked and documented before use. Photo: Dave Fischer.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>Undocumented data is hardly informative – it may be a page in a book, a file in an obsolete file format on a governmental server, an Excel sheet that you do not remember having checked for updates. Most data is useless, because we do not know how it can inform us, or we do not know if we can trust it. The processing can be a daunting task, not to mention the most boring and often neglected documentation duties after the dataset is final and pronounced error-free by the person in charge of quality control.&lt;/p>
&lt;h2 id="observatory-metadata-services">Our observatory automatically processes and documents the data&lt;/h2>
&lt;p>The good news about documentation and data validation costs is that they can be shared. If many users need GDP/capita data from all over the world in euros, then it is enough if only one entity, a data observatory, collects all GDP and population data expressed in dollars, korunas, and euros, and makes sure that the latest data is correctly converted to euros, and then correctly divided by the latest population figures. These tasks are error-prone, and should not be repeated by every data journalist, NGO employee, PhD student or junior analyst. This is one of the services of our data observatory.&lt;/p>
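&lt;p>As a minimal sketch of this shared task, the Python snippet below converts GDP figures reported in different currencies to euros and divides them by population. The record type, the function name, the exchange rates and all figures are hypothetical sample values chosen for illustration, not observatory data or code.&lt;/p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GdpRecord:
    """One country-level observation (hypothetical structure)."""
    country: str
    gdp: float        # GDP in units of the reporting currency
    currency: str     # ISO 4217 code, e.g. "USD", "CZK", "EUR"
    population: int


def gdp_per_capita_eur(record, eur_rates):
    """Convert GDP to euros first, then divide by population.

    eur_rates maps a currency code to the value of one unit in euros.
    """
    if record.currency not in eur_rates:
        raise KeyError(f"no euro exchange rate for {record.currency}")
    if not record.population > 0:
        raise ValueError(f"invalid population for {record.country}")
    gdp_eur = record.gdp * eur_rates[record.currency]
    return gdp_eur / record.population


# Made-up exchange rates and figures, for illustration only.
EUR_RATES = {"EUR": 1.0, "USD": 0.8, "CZK": 0.04}

records = [
    GdpRecord("A", 1_000_000.0, "EUR", 100),
    GdpRecord("B", 4_000_000.0, "USD", 200),
    GdpRecord("C", 50_000_000.0, "CZK", 400),
]

result = {r.country: gdp_per_capita_eur(r, EUR_RATES) for r in records}
```

&lt;p>Failing loudly on a missing exchange rate or an invalid population figure is a design choice in this sketch: a silent default would be exactly the kind of error that every downstream user would otherwise have to catch again.&lt;/p>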
&lt;ul>
&lt;li>
&lt;p>&lt;input checked="" disabled="" type="checkbox"> The tidy data format means that the data has a uniform and clear data structure and semantics, therefore it can be automatically validated for many common errors and can be automatically documented by either our software or any other professional data science application. It is not as strict as the schema for a relational database, but it is strict enough to make, among other things, importing into a database easy.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;input checked="" disabled="" type="checkbox"> The descriptive metadata contains information on how to find the data, access the data, join it with other data (interoperability) and use it, and reuse it, even years from now. Among others, it contains file format information and intellectual property rights information.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;input checked="" disabled="" type="checkbox"> The processing metadata makes the data usable in strictly regulated professional environments, such as in public administration, law firms, investment consultancies, or in scientific research. We give you the entire processing history of the data, which makes peer-review or external audit much easier and cheaper.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;input checked="" disabled="" type="checkbox"> The authoritative copy is held at an independent repository and has a globally unique identifier, which protects you from accidental data loss and from mixing it up with unfinished and untested versions.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;td style="text-align: center;">
&lt;figure id="figure-cutting-the-dataset-to-a-format-with-clear-semantics-and-documenting-it-with-the-fair-metadata-concep-exponentially-increases-the-value-of-data-it-can-be-publisehd-or-sold-at-a-premium-photo-andere-andrehttpscommonswikimediaorgwindexphpcurid4770037">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Cutting the dataset to a format with clear semantics and documenting it with the FAIR metadata concep exponentially increases the value of data. It can be publisehd or sold at a premium. Photo: [Andere Andre](https://commons.wikimedia.org/w/index.php?curid=4770037)." srcset="
/media/img/gems/Diamond_Polisher_hu2b5ca0e8d1290dc6b290d6b4669a6259_449722_27278366bdb30735ec3edb5dd68ce37b.webp 400w,
/media/img/gems/Diamond_Polisher_hu2b5ca0e8d1290dc6b290d6b4669a6259_449722_2022c9c74076769b68c8f788b6835f99.webp 760w,
/media/img/gems/Diamond_Polisher_hu2b5ca0e8d1290dc6b290d6b4669a6259_449722_1200x1200_fit_q75_h2_lanczos.webp 1200w"
src="https://reprex-next.netlify.app/media/img/gems/Diamond_Polisher_hu2b5ca0e8d1290dc6b290d6b4669a6259_449722_27278366bdb30735ec3edb5dd68ce37b.webp"
width="760"
height="506"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
Cutting the dataset to a format with clear semantics and documenting it with the FAIR metadata concept exponentially increases the value of data. It can be published or sold at a premium. Photo: &lt;a href="https://commons.wikimedia.org/w/index.php?curid=4770037" target="_blank" rel="noopener">Andere Andre&lt;/a>.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>While humans are much better at analysing the information and human agency is required for trustworthy AI, computers are much better at processing and documenting data. We apply these important concepts to our data service: we always process the data to the tidy format, we create an authoritative copy, and we always automatically add descriptive and processing metadata.&lt;/p>
&lt;h2 id="value-of-metadata">The value of metadata&lt;/h2>
&lt;p>Metadata is often more valuable and more costly to make than the data itself, yet it remains an elusive concept for senior or financial management. Metadata is information about how to correctly use the data and has no value without the data itself. Data acquisition, such as buying from a data vendor, or paying an opinion polling company, or external data consultants appears among the material costs, but metadata is never sold alone, and you do not see its cost.&lt;/p>
&lt;p>In most cases, the reason why &lt;a href="https://dataandlyrics.com/post/2021-06-18-gold-without-rush/" target="_blank" rel="noopener">there is no gold rush for open data&lt;/a> is the fact that while the EU member states release billions of euros&amp;rsquo; worth of data annually for free, or at very low cost, it comes without proper metadata.&lt;/p>
&lt;td style="text-align: center;">
&lt;figure id="figure-data-as-serviceservicesdata-as-servicereusable-legal-easy-to-import-interoperable-always-fresh-data-in-tidy-formats-with-a-modern-api-photo-edgar-sotohttpsunsplashcomphotosgb0bzgae1nk">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="[Data-as-Service](/services/data-as-service/)Reusable, legal, easy-to-import, interoperable, always fresh data in tidy formats with a modern API. Photo: [Edgar Soto](https://unsplash.com/photos/gb0BZGae1Nk)." srcset="
/media/img/gems/edgar-soto-gb0BZGae1Nk-unsplash_hu885793c483f74753314f6c800c67a06f_204775_81b97d34c1ccb0eb3994b312d0747e63.webp 400w,
/media/img/gems/edgar-soto-gb0BZGae1Nk-unsplash_hu885793c483f74753314f6c800c67a06f_204775_b3ddf8e86873a66ce16e8636fadc3357.webp 760w,
/media/img/gems/edgar-soto-gb0BZGae1Nk-unsplash_hu885793c483f74753314f6c800c67a06f_204775_1200x1200_fit_q75_h2_lanczos.webp 1200w"
src="https://reprex-next.netlify.app/media/img/gems/edgar-soto-gb0BZGae1Nk-unsplash_hu885793c483f74753314f6c800c67a06f_204775_81b97d34c1ccb0eb3994b312d0747e63.webp"
width="760"
height="506"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
&lt;a href="https://reprex-next.netlify.app/services/data-as-service/">Data-as-Service&lt;/a>&lt;/br>&lt;/br>Reusable, legal, easy-to-import, interoperable, always fresh data in tidy formats with a modern API. Photo: &lt;a href="https://unsplash.com/photos/gb0BZGae1Nk" target="_blank" rel="noopener">Edgar Soto&lt;/a>.
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>If the data source is cheap or of low quality, you do not even get the metadata. If you do not have it, its cost will show up as a human resource cost in research (when your analysts or junior researchers spend countless hours finding the missing metadata on the correct use of the data) or in sales costs (when you try to reuse a research, consulting or legal product and you have to comb through your archive and retest elements again and again).&lt;/p>
&lt;ul>
&lt;li>&lt;input checked="" disabled="" type="checkbox"> The data, together with the descriptive and administrative metadata, and links to the use license and the authoritative copy can be found in our API. Try it out!&lt;/li>
&lt;/ul></description></item><item><title>Including Indicators from Arab Barometer in Our Observatory</title><link>https://reprex-next.netlify.app/post/2021-06-28-arabbarometer/</link><pubDate>Mon, 28 Jun 2021 09:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-06-28-arabbarometer/</guid><description>&lt;p>&lt;em>A new version of the retroharmonize R package – which supports retrospective, ex post harmonization of survey data – was released yesterday after peer-review on CRAN. It allows us to compare opinion polling data from the Arab Barometer with the Eurobarometer and Afrobarometer. This is the first version that is released in the rOpenGov community, a community of R package developers on open government data analytics and related topics.&lt;/em>&lt;/p>
&lt;p>Surveys are the most important data sources in social and economic
statistics – they ask people about their lives, their attitudes and
self-reported actions, or record data from companies and NGOs. Survey
harmonization makes survey data comparable across time and countries. It
is very important, because often we do not know without comparison if an
indicator value is &lt;em>low&lt;/em> or &lt;em>high&lt;/em>. If 40% of the people think that
&lt;em>climate change is a very serious problem&lt;/em>, it does not really tell us
much without knowing what percentage of the people answered this
question similarly a year ago, or in other parts of the world.&lt;/p>
&lt;p>With the help of Ahmed Shaibani and Yousef Ibrahim, we created a third
case study after the
&lt;a href="https://retroharmonize.dataobservatory.eu/articles/eurobarometer.html" target="_blank" rel="noopener">Eurobarometer&lt;/a>,
and
&lt;a href="https://retroharmonize.dataobservatory.eu/articles/afrobarometer.html" target="_blank" rel="noopener">Afrobarometer&lt;/a>,
about working with the &lt;a href="https://retroharmonize.dataobservatory.eu/articles/arabbarometer.html" target="_blank" rel="noopener">Arab
Barometer&lt;/a>
harmonized survey data files.&lt;/p>
&lt;p>&lt;em>Ex ante&lt;/em> survey harmonization means that researchers design
questionnaires that are asking the same questions with the same survey
methodology in repeated, distinct times (waves), or across different
countries with carefully harmonized question translations. &lt;em>Ex post&lt;/em>
harmonization means that the resulting data has the same variable
names, same variable coding, and can be joined into a tidy data frame
for joint statistical analysis. While seemingly a simple task, it
involves plenty of metadata adjustments, because established survey
programs like Eurobarometer, Afrobarometer or Arab Barometer have
several decades of history, and several decades of coding practices and
file formatting legacy.&lt;/p>
&lt;ul>
&lt;li>&lt;em>Variable harmonization&lt;/em> means that if the same question is called
&lt;code>Q108&lt;/code> in one microdata source and &lt;code>eval-parl-elections&lt;/code> in another,
we make sure that both get the same harmonized, machine-readable
name without spaces and special characters.&lt;/li>
&lt;li>&lt;em>Variable label harmonization&lt;/em> means that the same questionnaire
items get the same numeric coding and same categorical labels.&lt;/li>
&lt;li>&lt;em>Missing case harmonization&lt;/em> means that various forms of missingness
are treated the same way.&lt;/li>
&lt;/ul>
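&lt;p>The variable harmonization step can be sketched in a few lines of Python. This is only an illustration of the idea, not the &lt;code>retroharmonize&lt;/code> implementation: the crosswalk table, the wave identifiers and both function names are hypothetical.&lt;/p>

```python
import re

# Hypothetical crosswalk: (survey wave, original name) mapped to one shared name.
CROSSWALK = {
    ("wave_a", "Q108"): "eval_parl_elections",
    ("wave_b", "eval-parl-elections"): "eval_parl_elections",
}


def to_machine_readable(name):
    """Lower-case a name and replace runs of other characters with '_'."""
    cleaned = re.sub(r"[^0-9a-z]+", "_", name.strip().lower())
    return cleaned.strip("_")


def harmonize_name(wave, original):
    """Return the shared name, falling back to a cleaned original name."""
    return CROSSWALK.get((wave, original), to_machine_readable(original))


harmonized = [
    harmonize_name("wave_a", "Q108"),
    harmonize_name("wave_b", "eval-parl-elections"),
]
```

&lt;p>Both original names resolve to the same harmonized name, so the two sources can be joined on that column. Names that are missing from the crosswalk are only cleaned, which keeps them machine readable but does not make them comparable across sources.&lt;/p>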
&lt;figure id="figure-for-the-evaluation-of-the-economic-situation-dataset-get-the-country-averages-and-aggregates-from-zenodohttpdoiorg105281zenodo5036432-and-the-plot-in-jpg-or-png-from-figsharehttpsfigsharecomarticlesfigurearab_barometer_5_econ_eval_by_country_png14865498">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/img/blogposts_2021/arab_barometer_5_evon_eval_by_country.png" alt="For the evaluation of the economic situation dataset, get the country averages and aggregates from [Zenodo](http://doi.org/10.5281/zenodo.5036432), and the plot in `jpg` or `png` from [figshare](https://figshare.com/articles/figure/arab_barometer_5_econ_eval_by_country_png/14865498)." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
For the evaluation of the economic situation dataset, get the country averages and aggregates from &lt;a href="http://doi.org/10.5281/zenodo.5036432" target="_blank" rel="noopener">Zenodo&lt;/a>, and the plot in &lt;code>jpg&lt;/code> or &lt;code>png&lt;/code> from &lt;a href="https://figshare.com/articles/figure/arab_barometer_5_econ_eval_by_country_png/14865498" target="_blank" rel="noopener">figshare&lt;/a>.
&lt;/figcaption>&lt;/figure>
&lt;p>In our new &lt;a href="https://retroharmonize.dataobservatory.eu/articles/arabbarometer.html" target="_blank" rel="noopener">Arab Barometer case
study&lt;/a>,
the evaluation of parliamentary elections has the following labels. We
code them consistently &lt;code>1: free_and_fair&lt;/code>, &lt;code>2: some_minor_problems&lt;/code>,
&lt;code>3: some_major_problems&lt;/code> and &lt;code>4: not_free&lt;/code>.&lt;/p>
&lt;table>
&lt;colgroup>
&lt;col style="width: 50%" />
&lt;col style="width: 50%" />
&lt;/colgroup>
&lt;tbody>
&lt;tr class="odd">
&lt;td style="text-align: left;">“0. missing”&lt;/td>
&lt;td style="text-align: left;">“1. they were completely free and fair”&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td style="text-align: left;">“2. they were free and fair, with some minor problems”&lt;/td>
&lt;td style="text-align: left;">“3. they were free and fair, with some major problems”&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td style="text-align: left;">“4. they were not free and fair”&lt;/td>
&lt;td style="text-align: left;">“8. i don’t know”&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td style="text-align: left;">“9. declined to answer”&lt;/td>
&lt;td style="text-align: left;">“Missing”&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td style="text-align: left;">“They were completely free and fair”&lt;/td>
&lt;td style="text-align: left;">“They were free and fair, with some minor breaches”&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td style="text-align: left;">“They were free and fair, with some major breaches”&lt;/td>
&lt;td style="text-align: left;">“They were not free and fair”&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td style="text-align: left;">“Don’t know”&lt;/td>
&lt;td style="text-align: left;">“Refuse”&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td style="text-align: left;">“Completely free and fair”&lt;/td>
&lt;td style="text-align: left;">“Free and fair, but with minor problems”&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td style="text-align: left;">“Free and fair, with major problems”&lt;/td>
&lt;td style="text-align: left;">“Not free or fair”&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td style="text-align: left;">“Don’t know (Do not read)”&lt;/td>
&lt;td style="text-align: left;">“Decline to answer (Do not read)”&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>Of course, this harmonization is essential to get clean results like this:&lt;/p>
&lt;figure id="figure-for-evaluation-or-reuse-of-parliamentary-elections-dataset-get-the-replication-data-and-the-code-from-the-zenodohhttpsdoiorg105281zenodo5034759-open-repository">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="For evaluation or reuse of parliamentary elections dataset get the replication data and the code from the [Zenodo](hhttps://doi.org/10.5281/zenodo.5034759) open repository." srcset="
/media/img/blogposts_2021/arabb-comparison-country-chart_hu876e56138097bf35e9ab80c0a7351314_159521_30b9d9bccbe8f347c912dbe10ef5159c.webp 400w,
/media/img/blogposts_2021/arabb-comparison-country-chart_hu876e56138097bf35e9ab80c0a7351314_159521_f7e62366b8310160e9cdd16714a5ac44.webp 760w,
/media/img/blogposts_2021/arabb-comparison-country-chart_hu876e56138097bf35e9ab80c0a7351314_159521_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/blogposts_2021/arabb-comparison-country-chart_hu876e56138097bf35e9ab80c0a7351314_159521_30b9d9bccbe8f347c912dbe10ef5159c.webp"
width="506"
height="760"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
For evaluation or reuse of parliamentary elections dataset get the replication data and the code from the &lt;a href="https://doi.org/10.5281/zenodo.5034759">Zenodo&lt;/a> open repository.
&lt;/figcaption>&lt;/figure>
&lt;p>In our case study, we had three forms of missingness: the respondent
&lt;em>did not know&lt;/em> the answer, the respondent &lt;em>did not want&lt;/em> to answer, and
at last, in some cases the &lt;em>respondent was not asked&lt;/em>, because the
country held no parliamentary elections. While in numerical processing,
all these answers must be left out from calculating averages, for
example, in a more detailed, categorical analysis they represent very
different cases. A high level of refusal to answer may in itself be an
indicator that democratic opinion forming is being suppressed.&lt;/p>
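&lt;p>The label and missing case harmonization described above can be sketched as follows. The four valid codes come from the text; the names of the three missing categories, the keyword rules, and the treatment of a bare &amp;ldquo;Missing&amp;rdquo; label as &lt;em>not asked&lt;/em> are assumptions made for this illustration, and the real &lt;code>retroharmonize&lt;/code> functions work differently.&lt;/p>

```python
import re

# Harmonized numeric codes named in the text.
VALID = {1: "free_and_fair", 2: "some_minor_problems",
         3: "some_major_problems", 4: "not_free"}
CODES = {label: code for code, label in VALID.items()}

# Assumed names for the three forms of missingness.
MISSING = {"do_not_know", "declined", "inapplicable"}


def harmonize_label(raw):
    """Map one raw questionnaire label to a harmonized label."""
    text = re.sub(r"^\d+\.\s*", "", raw.strip().lower())  # drop "1. " prefixes
    if "know" in text:
        return "do_not_know"
    if "decline" in text or "refuse" in text:
        return "declined"
    if text == "missing":
        return "inapplicable"  # assumption: treated as 'respondent not asked'
    if "not free" in text:
        return "not_free"
    if "major" in text:
        return "some_major_problems"
    if "minor" in text:
        return "some_minor_problems"
    if "free and fair" in text:
        return "free_and_fair"
    raise ValueError(f"unrecognized label: {raw!r}")


def mean_code(raw_labels):
    """Average the numeric codes, leaving out every form of missingness."""
    labels = [harmonize_label(raw) for raw in raw_labels]
    codes = [CODES[label] for label in labels if label in CODES]
    return sum(codes) / len(codes) if codes else None


def missing_counts(raw_labels):
    """Count each form of missingness separately for categorical analysis."""
    counts = dict.fromkeys(sorted(MISSING), 0)
    for raw in raw_labels:
        label = harmonize_label(raw)
        if label in MISSING:
            counts[label] += 1
    return counts
```

&lt;p>Keeping the three missing categories apart until the numeric step means that the same harmonized column can feed both an average and a categorical breakdown of refusals.&lt;/p>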
&lt;p>Survey harmonization with many countries entails tens of thousands of
small data management tasks, which, unless automatically documented,
logged, and created with reproducible code, is a hopelessly
error-prone process. We believe that our open-source software will bring
much new statistical information to light, which, while legally
open, was never processed due to the large investment needed.&lt;/p>
&lt;p>We have also started building experimental APIs that serve data produced by running
&lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">retroharmonize&lt;/a> regularly.
We will place cultural access and participation data in the &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital
Music Observatory&lt;/a>, climate
awareness, policy support and self-reported mitigation strategies into
the &lt;a href="https://greendeal.dataobservatory.eu/" target="_blank" rel="noopener">Green Deal Data
Observatory&lt;/a>, and economy and
well-being data into our &lt;a href="https://economy.dataobservatory.eu/" target="_blank" rel="noopener">Economy Data
Observatory&lt;/a>.&lt;/p>
&lt;h2 id="further-plans">Further plans&lt;/h2>
&lt;p>Retrospective survey harmonization is a far more complex task than this
blogpost suggests, because established survey programs have gathered decades of legacy data in legacy coding schemes and legacy file formats. Putting the data right, and especially putting the invaluable descriptive and administrative (processing) metadata right, is a huge undertaking. We are releasing example code, datasets and charts
for researchers to compare our harmonized results with theirs, and to
improve our software.&lt;/p>
&lt;h3 id="use-our-software">Use our software&lt;/h3>
&lt;p>The &lt;code>retroharmonize&lt;/code> R package can be freely used, modified and
distributed under the GPL-3 license. For the main developer and
contributors, see the
&lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">package&lt;/a> homepage. If you
use it for your work, please kindly cite it as:&lt;/p>
&lt;p>Daniel Antal (2021). retroharmonize: Ex Post Survey Data Harmonization.
R package version 0.1.17. &lt;a href="https://doi.org/10.5281/zenodo.5034752" target="_blank" rel="noopener">https://doi.org/10.5281/zenodo.5034752&lt;/a>&lt;/p>
&lt;p>Download the &lt;a href="https://reprex-next.netlify.app/media/bibliography/cite-retroharmonize.bib" target="_blank">BibLaTeX entry&lt;/a>.&lt;/p>
&lt;h3 id="tutorial-to-work-with-the-arab-barometer-survey-data">Tutorial to work with the Arab Barometer survey data&lt;/h3>
&lt;p>Daniel Antal, &amp;amp; Ahmed Shaibani. (2021, June 26). Case Study: Working
With Arab Barometer Surveys for the retroharmonize R package (Version
0.1.6). Zenodo. &lt;a href="https://doi.org/10.5281/zenodo.5034759" target="_blank" rel="noopener">https://doi.org/10.5281/zenodo.5034759&lt;/a>&lt;/p>
&lt;p>Use the replication data to report potential
&lt;a href="https://github.com/rOpenGov/retroharmonize/issues" target="_blank" rel="noopener">issues&lt;/a> and
to suggest improvements to the code:&lt;/p>
&lt;p>Daniel Antal, &amp;amp; Ahmed Shaibani. (2021). Replication Data for the
retroharmonize R Package Case Study: Working With Arab Barometer Surveys
(Version 0.1.6) [Data set]. Zenodo.
&lt;a href="https://doi.org/10.5281/zenodo.5034741" target="_blank" rel="noopener">https://doi.org/10.5281/zenodo.5034741&lt;/a>&lt;/p>
&lt;h3 id="experimental-api">Experimental API&lt;/h3>
&lt;p>We are also experimenting with the automated placement of authoritative
and citeable figures and datasets in open repositories. For the climate
awareness dataset get the country averages and aggregates from
&lt;a href="http://doi.org/10.5281/zenodo.5036432" target="_blank" rel="noopener">Zenodo&lt;/a>, and the plot in &lt;code>jpg&lt;/code>
or &lt;code>png&lt;/code> from &lt;a href="https://figshare.com/articles/figure/arab_barometer_5_econ_eval_by_country_png/14865498" target="_blank" rel="noopener">figshare&lt;/a>.
Our plan is to release open data in a modern API with rich descriptive
metadata meeting the &lt;em>Dublin Core&lt;/em> and &lt;em>DataCite&lt;/em> standards, and further
administrative metadata for correct coding, joining and further
manipulating the data, or for easy import into your database.&lt;/p>
&lt;h3 id="join-our-open-source-effort">Join our open source effort&lt;/h3>
&lt;p>Want to help us improve our open data service? Include
&lt;a href="https://www.latinobarometro.org/lat.jsp" target="_blank" rel="noopener">Lationbarómetro&lt;/a> and the
&lt;a href="https://caucasusbarometer.org/en/datasets/" target="_blank" rel="noopener">Caucasus Barometer&lt;/a> in our
offering? Join the rOpenGov community of R package developers, an our
open collaboration to create the automated data observatories. We are
not only looking for
&lt;a href="https://reprex-next.netlify.app/authors/developer/">developers&lt;/a>,
but &lt;a href="https://reprex-next.netlify.app/authors/curator/">data
curators&lt;/a> and
&lt;a href="https://reprex-next.netlify.app/authors/team/">service design
associates&lt;/a>, too.&lt;/p></description></item><item><title>Open Data - The New Gold Without the Rush</title><link>https://reprex-next.netlify.app/post/2021-06-18-gold-without-rush/</link><pubDate>Fri, 18 Jun 2021 17:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-06-18-gold-without-rush/</guid><description>&lt;p>&lt;em>If open data is the new gold, why do even those who release it fail to reuse it? We created an open collaboration of data curators and open-source developers to dig into novel open data sources and/or increase the usability of existing ones. We transform reproducible research software into research-as-service.&lt;/em>&lt;/p>
&lt;p>Every year, the EU announces that billions and billions of data are now “open” again, but this is not gold. At least not in the form of nicely minted gold coins, but in gold dust and nuggets found in the muddy banks of chilly rivers. There is no rush for it, because panning out its value requires a lot of hours of hard work. Our goal is to automate this work to make open data usable at scale, even in trustworthy AI solutions.&lt;/p>
&lt;figure id="figure-there-is-no-rush-for-it-because-panning-out-its-value-requires-a-lot-of-hours-of-hard-work-our-goal-is-to-automate-this-work-to-make-open-data-usable-at-scale-even-in-trustworthy-ai-solutions">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="There is no rush for it, because panning out its value requires a lot of hours of hard work. Our goal is to automate this work to make open data usable at scale, even in trustworthy AI solutions." srcset="
/media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_b042523dffe8143dea3d8c8c9c3262f4.webp 400w,
/media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_faa00e96d3d0b700cfcf1daa513f3ad2.webp 760w,
/media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_b042523dffe8143dea3d8c8c9c3262f4.webp"
width="760"
height="428"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
There is no rush for it, because panning out its value requires a lot of hours of hard work. Our goal is to automate this work to make open data usable at scale, even in trustworthy AI solutions.
&lt;/figcaption>&lt;/figure>
&lt;p>Most open data is not public; it is not downloadable from the Internet – in the EU parlance, “open” only means a legal entitlement to get access to it. And even in the rare cases when data is open and public, it is often marred by data quality issues. We are working on the prototypes of a data-as-service and research-as-service built with open-source statistical software that taps into various and often neglected open data sources.&lt;/p>
&lt;p>We are in the prototype phase in June and our intentions are to have a well-functioning service by the time of the conference, because we are working only with open-source software elements; our technological readiness level is already very high. The novelty of our process is that we are trying to further develop and integrate a few open-source technology items into technologically and financially sustainable data-as-service and even research-as-service solutions.&lt;/p>
&lt;figure id="figure-our-review-of-about-80-eu-un-and-oecd-data-observatories-reveals-that-most-of-them-do-not-use-these-organizationss-open-data---instead-they-use-various-and-often-not-well-processed-proprietary-sources">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Our review of about 80 EU, UN and OECD data observatories reveals that most of them do not use these organizations&amp;#39;s open data - instead they use various, and often not well processed proprietary sources." srcset="
/media/img/observatory_screenshots/observatory_collage_16x9_800_hu47f74f5cdae63c7248c2367b9d148671_353025_0079ea9844f6c5e52b52fd0e627467a2.webp 400w,
/media/img/observatory_screenshots/observatory_collage_16x9_800_hu47f74f5cdae63c7248c2367b9d148671_353025_ecd6d08ba5e9bac19c8173546f036651.webp 760w,
/media/img/observatory_screenshots/observatory_collage_16x9_800_hu47f74f5cdae63c7248c2367b9d148671_353025_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/observatory_screenshots/observatory_collage_16x9_800_hu47f74f5cdae63c7248c2367b9d148671_353025_0079ea9844f6c5e52b52fd0e627467a2.webp"
width="760"
height="428"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Our review of about 80 EU, UN and OECD data observatories reveals that most of them do not use these organizations&amp;rsquo; open data - instead they use various, and often not well processed proprietary sources.
&lt;/figcaption>&lt;/figure>
&lt;p>We are taking a new and modern approach to the &lt;code>data observatory&lt;/code> concept, and modernizing it with the application of 21st century data and metadata standards, the new results of reproducible research and data science. Various UN and OECD bodies, and particularly the European Union, support or maintain more than 60 data observatories, or permanent data collection and dissemination points, but even these do not use the open data of these organizations and their members. We are building open-source data observatories, which run open-source statistical software that automatically processes and documents reusable public sector data (from public transport, meteorology, tax offices, taxpayer funded satellite systems, etc.) and reusable scientific data (from EU taxpayer funded research) into new, high quality statistical indicators.&lt;/p>
&lt;figure id="figure-we-are-taking-a-new-and-modern-approach-to-the-data-observatory-concept-and-modernizing-it-with-the-application-of-21st-century-data-and-metadata-standards-the-new-results-of-reproducible-research-and-data-science">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/img/slides/automated_observatory_value_chain.jpg" alt="We are taking a new and modern approach to the ‘data observatory’ concept, and modernizing it with the application of 21st century data and metadata standards, the new results of reproducible research and data science" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
We are taking a new and modern approach to the ‘data observatory’ concept, and modernizing it with the application of 21st century data and metadata standards, the new results of reproducible research and data science
&lt;/figcaption>&lt;/figure>
&lt;ul>
&lt;li>We are building various open-source data collection tools in R and Python to bring up data from big data APIs and legally open, but not public, and not well served data sources. For example, we are working on capturing representative data from the Spotify API or creating harmonized datasets from the Eurobarometer and Afrobarometer survey programs.&lt;/li>
&lt;li>Open data is usually not public; whatever is legally accessible is usually not ready to use for commercial or scientific purposes. In Europe, almost all taxpayer funded data is legally open for reuse, but it is usually stored in heterogeneous formats, processed for an original governmental or scientific need, and documented to various and often low standards. Our expert data curators are looking for new data sources that should be (re-)processed and re-documented to be usable for a wider community. We would like to introduce our service flow, which touches upon many important aspects of data scientist, data engineer and data curatorial work.&lt;/li>
&lt;li>We believe that even such generally trusted data sources as Eurostat often need to be reprocessed, because various legal and political constraints do not allow the common European statistical services to provide optimal quality data – for example, on the regional and city levels.&lt;/li>
&lt;li>With &lt;a href="https://reprex-next.netlify.app/authors/ropengov/">rOpenGov&lt;/a> and other partners, we are creating open-source statistical software in R to re-process these heterogenous and low-quality data into tidy statistical indicators to automatically validate and document it.&lt;/li>
&lt;li>We are carefully documenting and releasing administrative, processing, and descriptive metadata, following international metadata standards, to make our data easy to find and easy to use for data analysts.&lt;/li>
&lt;li>We are automatically creating depositions and authoritative copies marked with an individual digital object identifier (DOI) to maintain data integrity.&lt;/li>
&lt;li>We are building simple databases and supporting APIs that release the data without restrictions, in a tidy format that is easy to join with other data, or easy to join into databases, together with standardized metadata.&lt;/li>
&lt;li>We maintain observatory websites (see: &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital Music Observatory&lt;/a>, &lt;a href="https://greendeal.dataobservatory.eu/" target="_blank" rel="noopener">Green Deal Data Observatory&lt;/a>, &lt;a href="https://economy.dataobservatory.eu/" target="_blank" rel="noopener">Economy Data Observatory&lt;/a>) where not only the data is available, but we provide tutorials and use cases to make it easier to use them. Our mission is to show a modern, 21st century reimagination of the data observatory concept developed and supported by the UN, EU and OECD, and we want to show that modern reproducible research and open data could make the existing 60 data observatories and the planned new ones grow faster into data ecosystems.&lt;/li>
&lt;/ul>
&lt;p>We are working around the open collaboration concept, which is well-known in open source software development and reproducible science, but we try to make this agile project management methodology more inclusive, and include data curators, and various institutional partners into this approach. Based around our early-stage startup, Reprex, and the open-source developer community rOpenGov, we are working together with other developers, data scientists, and domain specific data experts in climate change and mitigation, antitrust and innovation policies, and various aspects of the music and film industry.&lt;/p>
&lt;figure id="figure-our-open-collaboration-is-truly-open-new-data-curatorsauthorscuratordevelopersauthorsdeveloper-and-service-designersauthorsteam-even-volunteers-and-citizen-scientists-are-welcome-to-join">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Our open collaboration is truly open: new [data curators](/authors/curator/),[developers](/authors/developer/) and [service designers](/authors/team/), even volunteers and citizen scientists are welcome to join." srcset="
/media/img/observatory_screenshots/dmo_contributors_hua4f41ef7327b64bb97f169af135070bd_140729_a07a8e618fa7317f6f8256b9a334262e.webp 400w,
/media/img/observatory_screenshots/dmo_contributors_hua4f41ef7327b64bb97f169af135070bd_140729_3a4ae7f72478fd880961b08e1f7075dd.webp 760w,
/media/img/observatory_screenshots/dmo_contributors_hua4f41ef7327b64bb97f169af135070bd_140729_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/observatory_screenshots/dmo_contributors_hua4f41ef7327b64bb97f169af135070bd_140729_a07a8e618fa7317f6f8256b9a334262e.webp"
width="760"
height="427"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Our open collaboration is truly open: new &lt;a href="https://reprex-next.netlify.app/authors/curator/">data curators&lt;/a>, &lt;a href="https://reprex-next.netlify.app/authors/developer/">developers&lt;/a> and &lt;a href="https://reprex-next.netlify.app/authors/team/">service designers&lt;/a>, even volunteers and citizen scientists are welcome to join.
&lt;/figcaption>&lt;/figure>
&lt;p>Our open collaboration is truly open: new &lt;a href="https://reprex-next.netlify.app/authors/curator/">data curators&lt;/a>, data scientists and data engineers are welcome to join. We develop open-source software in an agile way, so you can join in with an intermediate programming skill to build unit tests or add new functionality, and if you are a beginner, you can start with documentation and testing our tutorials. For business, policy, and scientific data analysts, we provide unexploited, exciting new datasets. Advanced developers can &lt;a href="https://reprex-next.netlify.app/authors/developer/">join&lt;/a> our development team: the statistical data creation is mainly made in the R language, and the service infrastructure in Python and Go components.&lt;/p></description></item><item><title>There are Numerous Advantages of Switching from a National Level of the Analysis to a Sub National Level</title><link>https://reprex-next.netlify.app/post/2021-06-16-regions-release/</link><pubDate>Wed, 16 Jun 2021 12:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-06-16-regions-release/</guid><description>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/img/package_screenshots/regions_017_169.png" alt="" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;p>The new version of our &lt;a href="https://ropengov.org/" target="_blank" rel="noopener">rOpenGov&lt;/a> R package
&lt;a href="https://regions.dataobservatory.eu/" target="_blank" rel="noopener">regions&lt;/a> was released today on
CRAN. This package is one of the engines of our experimental open
data-as-service prototypes, the &lt;a href="https://greendeal.dataobservatory.eu/" target="_blank" rel="noopener">Green Deal Data
Observatory&lt;/a>, the &lt;a href="https://economy.dataobservatory.eu/" target="_blank" rel="noopener">Economy Data
Observatory&lt;/a>, and the &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital Music
Observatory&lt;/a>, which aim to
place open data packages into open-source applications.&lt;/p>
&lt;p>In international comparisons, the use of nationally aggregated indicators
often has many disadvantages: countries exhibit very different levels of
homogeneity, and the number of observations is often too small
for a cross-sectional analysis. When comparing European countries, a few
missing cases can limit the cross-section of countries to around 20
cases, which rules out many analytical methods. Working with
sub-national statistics has many advantages: the similarity of the
aggregation level and the high number of observations allow more precise
control of model parameters and errors, and the number of observations
grows from 20 to 200-300.&lt;/p>
&lt;figure id="figure-the-change-from-national-to-sub-national-level-comes-with-a-huge-data-processing-price-internal-administrative-boundaries-their-names-codes-codes-change-very-frequently">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="The change from national to sub-national level comes with a huge data processing price: internal administrative boundaries, their names, codes codes change very frequently." srcset="
/media/img/blogposts_2021/indicator_with_map_hue9f606f6489f63a22f67aeb7e2b3402b_98843_df043b13fb62aa7b45aa15fad51f4229.webp 400w,
/media/img/blogposts_2021/indicator_with_map_hue9f606f6489f63a22f67aeb7e2b3402b_98843_09a0d6124e334c5f1727420a059512a9.webp 760w,
/media/img/blogposts_2021/indicator_with_map_hue9f606f6489f63a22f67aeb7e2b3402b_98843_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/blogposts_2021/indicator_with_map_hue9f606f6489f63a22f67aeb7e2b3402b_98843_df043b13fb62aa7b45aa15fad51f4229.webp"
width="760"
height="428"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
The change from national to sub-national level comes with a huge data processing price: internal administrative boundaries, their names and codes change very frequently.
&lt;/figcaption>&lt;/figure>
&lt;p>Yet the change from national to sub-national level comes with a huge
data processing price. National boundaries are relatively stable,
with only a handful of changes in each recent decade, because changing
them requires a more-or-less global consensus. States, however,
are free to change their internal administrative boundaries, and they do
so frequently. This means that the names, identification codes
and boundary definitions of sub-national regions change very often, so
joining data from different sources and different years can be very
difficult.&lt;/p>
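&lt;p>A minimal base R sketch of this joining problem is shown below. The table names, the values, and the choice of region codes are assumptions made only for illustration, and the codes should be checked against the official correspondence tables. Two tables describe the same region under an old and a new code, so a plain join silently drops it.&lt;/p>
&lt;pre>&lt;code># Hypothetical indicator tables from two sources (illustrative values only)
survey_2013 &amp;lt;- data.frame(geo = c(&amp;quot;FR24&amp;quot;, &amp;quot;HU21&amp;quot;), trust = c(0.41, 0.38))
stats_2016 &amp;lt;- data.frame(geo = c(&amp;quot;FRB0&amp;quot;, &amp;quot;HU21&amp;quot;), gdp = c(27.1, 18.4))

# An inner join keeps only the codes that are present in both tables
joined &amp;lt;- merge(survey_2013, stats_2016, by = &amp;quot;geo&amp;quot;)
nrow(joined)

# Codes that found no match and should be checked for recoding
unmatched &amp;lt;- setdiff(survey_2013$geo, stats_2016$geo)
unmatched
&lt;/code>&lt;/pre>
&lt;p>Listing the unmatched codes before joining is a cheap way to notice that a region was renamed or recoded rather than missing from the data.&lt;/p>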
&lt;figure id="figure-our-regions-r-packagehttpsregionsdataobservatoryeu-helps-the-data-processing-validation-and-imputation-of-sub-national-regional-datasets-and-their-coding">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Our [regions R package](https://regions.dataobservatory.eu/) helps the data processing, validation and imputation of sub-national, regional datasets and their coding." srcset="
/media/img/blogposts_2021/recoded_indicator_with_map_hubda8124fbfd6305eacfd3d4f0fcd06cc_71873_65df57cf4311bb2623535a1a5be044c0.webp 400w,
/media/img/blogposts_2021/recoded_indicator_with_map_hubda8124fbfd6305eacfd3d4f0fcd06cc_71873_81a53fd42fac7f0c3fe4e1a89d5b7892.webp 760w,
/media/img/blogposts_2021/recoded_indicator_with_map_hubda8124fbfd6305eacfd3d4f0fcd06cc_71873_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/blogposts_2021/recoded_indicator_with_map_hubda8124fbfd6305eacfd3d4f0fcd06cc_71873_65df57cf4311bb2623535a1a5be044c0.webp"
width="760"
height="428"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Our &lt;a href="https://regions.dataobservatory.eu/" target="_blank" rel="noopener">regions R package&lt;/a> helps the data processing, validation and imputation of sub-national, regional datasets and their coding.
&lt;/figcaption>&lt;/figure>
&lt;p>There are numerous advantages of switching from a national level of
analysis to a sub-national level, but the switch comes with a huge price in data
processing, validation and imputation, and the
&lt;a href="https://regions.dataobservatory.eu/" target="_blank" rel="noopener">regions&lt;/a> package aims to help with this
process.&lt;/p>
&lt;p>You can review the problem, and the code that created the two map
comparisons, in the &lt;a href="https://regions.dataobservatory.eu/articles/maping.html" target="_blank" rel="noopener">Maping Regional Data, Maping Metadata
Problems&lt;/a>
vignette article of the package. A more detailed problem description can
be found in &lt;a href="https://regions.dataobservatory.eu/articles/Regional_stats.html" target="_blank" rel="noopener">Working With Regional, Sub-National Statistical
Products&lt;/a>.&lt;/p>
&lt;p>This package is an offspring of the
&lt;a href="https://ropengov.github.io/eurostat/" target="_blank" rel="noopener">eurostat&lt;/a> package on
&lt;a href="https://ropengov.github.io/" target="_blank" rel="noopener">rOpenGov&lt;/a>. It started as a tool to
validate and re-code regional Eurostat statistics, but it aims to be a
general solution for all sub-national statistics. It will be developed
parallel with other rOpenGov packages.&lt;/p>
&lt;h2 id="get-the-package">Get the Package&lt;/h2>
&lt;p>You can install the development version from
&lt;a href="https://github.com/" target="_blank" rel="noopener">GitHub&lt;/a> with:&lt;/p>
&lt;pre>&lt;code>devtools::install_github(&amp;quot;rOpenGov/regions&amp;quot;)
&lt;/code>&lt;/pre>
&lt;p>or the released version from CRAN:&lt;/p>
&lt;pre>&lt;code>install.packages(&amp;quot;regions&amp;quot;)
&lt;/code>&lt;/pre>
&lt;p>You can review the complete package documentation on
&lt;a href="https://regions.dataobservatory.eu/" target="_blank" rel="noopener">regions.dataobservatory.eu&lt;/a>. If
you find any problems with the code, please raise an issue on
&lt;a href="https://github.com/rOpenGov/regions" target="_blank" rel="noopener">GitHub&lt;/a>. Pull requests are welcome
if you agree with the &lt;a href="https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html" target="_blank" rel="noopener">Contributor Code of
Conduct&lt;/a>.&lt;/p>
&lt;p>If you use &lt;code>regions&lt;/code> in your work, please cite the
package as:
Daniel Antal, Kasia Kulma, Istvan Zsoldos, &amp;amp; Leo Lahti. (2021, June 16). regions (Version 0.1.7). CRAN. &lt;a href="https://doi.org/10.5281/zenodo.4965909">https://doi.org/10.5281/zenodo.4965909&lt;/a>&lt;/p>
&lt;p>&lt;a href="https://cran.r-project.org/package=regions" target="_blank" rel="noopener">
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://www.r-pkg.org/badges/version/regions" alt="CRAN_Status_Badge" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/a>&lt;/p>
&lt;h2 id="join-us">Join us&lt;/h2>
&lt;p>&lt;em>Join our open collaboration Economy Data Observatory team as a &lt;a href="https://reprex-next.netlify.app/authors/curator">data curator&lt;/a>, &lt;a href="https://reprex-next.netlify.app/authors/developer">developer&lt;/a> or &lt;a href="https://reprex-next.netlify.app/authors/team">business developer&lt;/a>. More interested in environmental impact analysis? Try our &lt;a href="https://greendeal.dataobservatory.eu/#contributors" target="_blank" rel="noopener">Green Deal Data Observatory&lt;/a> team! Or does your interest lie more in data governance, trustworthy AI and other digital market problems? Check out our &lt;a href="https://music.dataobservatory.eu/#contributors" target="_blank" rel="noopener">Digital Music Observatory&lt;/a> team!&lt;/em>&lt;/p>
&lt;p>&lt;a href="https://twitter.com/intent/follow?screen_name=EconDataObs" target="_blank" rel="noopener">
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://img.shields.io/twitter/follow/EconDataObs.svg?style=social" alt="Follow GreenDealObs" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/a>&lt;/p></description></item><item><title>Open Data is Like Gold in the Mud Below the Chilly Waves of Mountain Rivers</title><link>https://reprex-next.netlify.app/post/2021-06-10-founder-daniel-antal/</link><pubDate>Thu, 10 Jun 2021 07:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-06-10-founder-daniel-antal/</guid><description>
&lt;figure id="figure-open-data-is-like-gold-in-the-mud-below-the-chilly-waves-of-mountain-rivers-panning-it-out-requires-a-lot-of-patience-or-a-good-machine">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Open data is like gold in the mud below the chilly waves of mountain rivers. Panning it out requires a lot of patience, or a good machine." srcset="
/media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_b042523dffe8143dea3d8c8c9c3262f4.webp 400w,
/media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_faa00e96d3d0b700cfcf1daa513f3ad2.webp 760w,
/media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_b042523dffe8143dea3d8c8c9c3262f4.webp"
width="760"
height="428"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Open data is like gold in the mud below the chilly waves of mountain rivers. Panning it out requires a lot of patience, or a good machine.
&lt;/figcaption>&lt;/figure>
&lt;p>&lt;strong>As the founder of the automated data observatories that are part of Reprex’s core activities, what type of data do you usually use in your day-to-day work?&lt;/strong>&lt;/p>
&lt;p>The automated data observatories are the results of syndicated research, data pooling, and other creative solutions to the problem of missing or hard-to-find data. The music industry is a very fragmented industry, where market research budgets and data are scattered across tens of thousands of small organizations in Europe. Working for the music and film industry as a data analyst and economist was always a pain because most of the effort went into trying to find any data that could be analyzed. I spent most of the last 7-8 years trying to find any sort of information, from satellites to government archives, that could be formed into actionable data. I see three big sources of information: textual, numeric, and continuous recordings from on-site, offsite, and satellite sensors. I am much better with numbers than with natural language processing, and I am &lt;a href="https://greendeal.dataobservatory.eu/post/2021-06-06-tutorial-cds/" target="_blank" rel="noopener">improving with sensory sources&lt;/a>. But technically, I can mint any systematic information (the text of an old book, a satellite image, or an opinion poll) into datasets.&lt;/p>
&lt;p>&lt;strong>For you, what would be the ultimate dataset, or datasets that you would like to see in the Economy Data Observatory?&lt;/strong>&lt;/p>
&lt;p>I am a data scientist now, but I used to be a regulatory economist, and I have worked a lot with competition policy and monopoly regulation issues. Our observatories can automatically monitor market and environmental processes, which would allow us to get into computational antitrust. Peter Ormosi, our competition curator, is particularly &lt;a href="https://economy.dataobservatory.eu/post/2021-06-02-data-curator-peter-ormosi/" target="_blank" rel="noopener">interested in&lt;/a> killer acquisitions: approved mergers of big companies that end up piling up patents that are not used. I am more interested in describing systematically, in real time, which markets are getting more concentrated and which are getting more competitive. Does data concentration coincide with market concentration?&lt;/p>
&lt;p>To bring an example from the realm of our &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital Music Observatory&lt;/a>, which was a prototype for this one, I have been working for some time on creating streaming volume and price indexes, like the &lt;em>Dow Jones Industrial Average&lt;/em> or the various bond market indexes, that say more about price, demand, and potential revenue in music streaming markets all over the world. We did a first take on this in the &lt;a href="https://ceereport2020.ceemid.eu/" target="_blank" rel="noopener">Central European Music Industry Report&lt;/a> and recently we iterated on the model for the &lt;em>UK Intellectual Property Office&lt;/em> and the &lt;em>UK Music Creators’ Earnings&lt;/em> project. We want to take this further to create a pan-European streaming market index, and we will probably be the first to actually be able to report on music market concentrations, and in fact, more or less in real time.&lt;/p>
&lt;figure id="figure-we-would-like-to-further-developer-our-20-country-streaming-indexeshttpsceereport2020ceemideumarkethtmlceemid-ci-volume-indexes-into-a-global-music-market-index">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="We would like to further developer our 20-country [streaming indexes]((https://ceereport2020.ceemid.eu/market.html#ceemid-ci-volume-indexes)) into a global music market index." srcset="
/media/img/blogposts_2021/medianvalue-1_hu5941f179e15628adbbb6d4dc0db86cd1_46382_59d954e926db1ce3ce9376aac454a3aa.webp 400w,
/media/img/blogposts_2021/medianvalue-1_hu5941f179e15628adbbb6d4dc0db86cd1_46382_75d58bfbbfae9d25c5551030d6d4206a.webp 760w,
/media/img/blogposts_2021/medianvalue-1_hu5941f179e15628adbbb6d4dc0db86cd1_46382_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/blogposts_2021/medianvalue-1_hu5941f179e15628adbbb6d4dc0db86cd1_46382_59d954e926db1ce3ce9376aac454a3aa.webp"
width="760"
height="428"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
We would like to further develop our 20-country &lt;a href="https://ceereport2020.ceemid.eu/market.html#ceemid-ci-volume-indexes">streaming indexes&lt;/a> into a global music market index.
&lt;/figcaption>&lt;/figure>
&lt;p>&lt;strong>Is there a number or piece of information that recently surprised you? If so, what was it?&lt;/strong>&lt;/p>
&lt;p>There were a few numbers that surprised me, and some of them were brought up by our observatory teams. Karel is &lt;a href="post/2021-06-08-data-curator-karel-volckaert/">talking&lt;/a> about the fact that not all green energy is actually green: many hydropower stations contribute to the greenhouse effect rather than reduce it. Annette brought up the growing interest in the &lt;a href="https://reprex-next.netlify.app/post/2021-06-09-team-annette-wong/">Dalmatian breed&lt;/a> after the Disney &lt;em>101 Dalmatians&lt;/em> movies, and it reminded me of the astonishing growth in interest in chess sets, chess tutorials, and platform subscriptions after the success of Netflix’s &lt;em>The Queen’s Gambit&lt;/em>.&lt;/p>
&lt;figure id="figure-the-queens-gambit-chess-boom-moves-online-by-rachael-dottle-on-bloombergcomhttpswwwbloombergcomgraphics2020-chess-boom">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="*The Queen’s Gambit’ Chess Boom Moves Online By Rachael Dottle* on [bloomberg.com](https://www.bloomberg.com/graphics/2020-chess-boom/)" srcset="
/media/img/blogposts_2021/queens_gambit_bloomberg_hub50434a1789646b36daf41ad10e65b52_92708_4fc47acea402086dd3891772877289db.webp 400w,
/media/img/blogposts_2021/queens_gambit_bloomberg_hub50434a1789646b36daf41ad10e65b52_92708_b60a154be5ab781fb70d16f62f39966c.webp 760w,
/media/img/blogposts_2021/queens_gambit_bloomberg_hub50434a1789646b36daf41ad10e65b52_92708_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/media/img/blogposts_2021/queens_gambit_bloomberg_hub50434a1789646b36daf41ad10e65b52_92708_4fc47acea402086dd3891772877289db.webp"
width="760"
height="428"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
&lt;em>‘The Queen’s Gambit’ Chess Boom Moves Online&lt;/em> by Rachael Dottle on &lt;a href="https://www.bloomberg.com/graphics/2020-chess-boom/" target="_blank" rel="noopener">bloomberg.com&lt;/a>
&lt;/figcaption>&lt;/figure>
&lt;p>Annette talks about the importance of cultural influencers, and on that theme, what could be more exciting than the fact that &lt;a href="https://www.netflix.com/nl-en/title/80234304" target="_blank" rel="noopener">Netflix’s biggest success&lt;/a> so far is not a detective series or a soap opera but a coming-of-age story of a female chess prodigy? Intelligence is sexy, and we are in the intelligence business.&lt;/p>
&lt;p>But to cite a more serious and sobering number, I recently read with surprise that there are &lt;a href="https://www.theguardian.com/society/2021/may/27/number-of-smokers-has-reached-all-time-high-of-11-billion-study-finds" target="_blank" rel="noopener">more people smoking cigarettes&lt;/a> on Earth in 2021 than in 1990. Population growth in developing countries more than offset the shrinking number of smokers in developed countries. While I live in Europe, where smoking is strongly declining, it reminds me that Europe’s population is a small part of the world. We cannot take for granted that our home-grown experiences about the world are globally valid.&lt;/p>
&lt;p>&lt;strong>Do you have a good example of really good, or really bad use of data?&lt;/strong>&lt;/p>
&lt;p>&lt;a href="https://fivethirtyeight.com/" target="_blank" rel="noopener">FiveThirtyEight.com&lt;/a> had a wonderful podcast series, produced by Jody Avirgan, called &lt;em>What’s the Point&lt;/em>. It is exactly about good and bad uses of data, and each episode is super interesting. Maybe the most memorable is &lt;em>Why the Bronx Really Burned&lt;/em>. New York City tried to measure fire response times, identify redundancies in service, and close or re-allocate fire stations accordingly. What resulted, though, was a perfect storm of bad data: The methodology was flawed, the analysis was rife with biases, and the results were interpreted in a way that stacked the deck against poorer neighborhoods. It is similar to many stories told in a very compelling argument by Catherine D’Ignazio and Lauren F. Klein in their much celebrated book, &lt;em>Data Feminism&lt;/em>. Usually, the bad use of data starts with a bad data collection practice. Data analysts in corporations, NGOs, public policy organizations and even in science usually analyze the data that is available.&lt;/p>
&lt;p>&lt;em>You can find these examples, together with many more that our contributors recommend, in the motivating examples of the &lt;a href="https://contributors.dataobservatory.eu/data-curators.html#create-new-datasets" target="_blank" rel="noopener">Create New Datasets&lt;/a> and the &lt;a href="https://contributors.dataobservatory.eu/data-curators.html#critical-attitude" target="_blank" rel="noopener">Remain Critical&lt;/a> parts of our onboarding material. We hope that more and more professionals and citizen scientists will help us to create high-quality and open data.&lt;/em>&lt;/p>
&lt;p>The real power lies in designing a data collection program. A consistent data collection program usually requires an investment that only powerful organizations, such as government agencies, very large corporations, or the richest universities can afford. You cannot really analyze the data that is not collected and recorded; and usually what is not recorded is more interesting than what is. Our observatories want to democratize the data collection process and make it more available, more shared with research automation and pooling.&lt;/p>
&lt;figure id="figure-you-cannot-really-analyze-the-data-that-is-not-collected-and-recorded-and-usually-what-is-not-recorded-is-more-interesting-than-what-is-our-observatories-want-to-democratize-the-data-collection-process-and-make-it-more-available-more-shared-with-research-automation-and-pooling">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/img/slides/value_added_from_automation.png" alt="You cannot really analyze the data that is not collected and recorded; and usually what is not recorded is more interesting than what is. Our observatories want to democratize the data collection process and make it more available, more shared with research automation and pooling." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
You cannot really analyze the data that is not collected and recorded; and usually what is not recorded is more interesting than what is. Our observatories want to democratize the data collection process and make it more available, more shared with research automation and pooling.
&lt;/figcaption>&lt;/figure>
&lt;p>&lt;strong>From your perspective, what do you see being the greatest problem with open data in 2021?&lt;/strong>&lt;/p>
&lt;p>I have been involved with open data policies since 2004. The problem has not changed much: more and more data are available from governmental and scientific sources, but in a form that makes them useless. Data without a clear description and clear processing information is useless for analytical purposes: it cannot be integrated with other data, and it cannot be trusted and verified. If researchers or government entities that fall under the &lt;a href="https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=uriserv:OJ.L_.2019.172.01.0056.01.ENG" target="_blank" rel="noopener">Open Data Directive&lt;/a> release data for reuse without descriptive or processing metadata, it is almost as if they did not release anything. You need this additional information to make valid analyses of the data, and reverse-engineering it may cost more than recollecting the data in a properly documented process. Our developers, particularly &lt;a href="https://reprex-next.netlify.app/post/2021-06-04-developer-leo-lahti/">Leo&lt;/a> and &lt;a href="post/2021-06-07-data-curator-pyry-kantanen/">Pyry&lt;/a>, talk eloquently about why you have to be careful even with governmental statistical products, and constantly be on the lookout for data quality problems.&lt;/p>
&lt;figure id="figure-our-apidata-is-not-only-publishing-descriptive-and-processing-metadata-alongside-with-our-data-but-we-also-make-all-critical-elements-of-our-processing-code-available-for-peer-review-on-ropengovauthorsropengov">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/img/observatory_screenshots/EDO_API_metadata_table.png" alt="Our [API](/#data) is not only publishing descriptive and processing metadata alongside with our data, but we also make all critical elements of our processing code available for peer-review on [rOpenGov](/authors/ropengov/)" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Our &lt;a href="https://reprex-next.netlify.app/#data">API&lt;/a> is not only publishing descriptive and processing metadata alongside with our data, but we also make all critical elements of our processing code available for peer-review on &lt;a href="https://reprex-next.netlify.app/authors/ropengov/">rOpenGov&lt;/a>
&lt;/figcaption>&lt;/figure>
&lt;p>&lt;strong>What do you think the Economy Data Observatory and our other automated observatories can do to make open data more credible in the European economic policy community and be accepted as verified information?&lt;/strong>&lt;/p>
&lt;p>Most of our work is in research automation, and a very large part of our effort goes into reverse engineering missing descriptive and processing metadata. In a way, I like to compare our working method to that of the open-source intelligence platform &lt;a href="https://www.bellingcat.com" target="_blank" rel="noopener">Bellingcat&lt;/a>. They were able to use publicly available, &lt;a href="https://www.bellingcat.com/category/resources/case-studies/?fwp_tags=mh17" target="_blank" rel="noopener">scattered information from satellites and social media&lt;/a> to identify each member of the Russian military unit that illegally entered the territory of Ukraine and shot down Malaysia Airlines flight MH17 with 298 people, mainly Dutch civilians, on board.&lt;/p>
&lt;figure id="figure-how-we-create-value-for-research-oriented-consultancies-public-policy-institutes-university-research-teams-journalists-or-ngos">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/img/slides/automated_observatory_value_chain.jpg" alt="How we create value for research-oriented consultancies, public policy institutes, university research teams, journalists or NGOs." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
How we create value for research-oriented consultancies, public policy institutes, university research teams, journalists or NGOs.
&lt;/figcaption>&lt;/figure>
&lt;p>We do not do such investigations, but we work very similarly in how we filter through many data sources and attempt to verify them when their descriptions and processing history are unknown. In recent years, we were able to restore the metadata of many European and African open data surveys, economic impact and environmental impact data, and many other open datasets that had been lying around for years without users.&lt;/p>
&lt;p>Open data is like gold in the mud below the chilly waves of mountain rivers. Panning it out requires a lot of patience, or a good machine. I think we will come to as surprising and strong findings as Bellingcat, but we are not focusing on individual events and stories, but on social and environmental processes and changes.&lt;/p>
&lt;figure id="figure-join-our-open-collaboration-economy-data-observatory-team-as-a-data-curatorauthorscurator-developerauthorsdeveloper-or-business-developerauthorsteam-or-share-your-data-in-our-public-repository-economy-data-observatory-on-zenodohttpszenodoorgcommunitieseconomy_observatory">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/img/observatory_screenshots/edo_and_zenodo.png" alt="Join our open collaboration Economy Data Observatory team as a [data curator](/authors/curator), [developer](/authors/developer) or [business developer](/authors/team), or share your data in our public repository [Economy Data Observatory on Zenodo](https://zenodo.org/communities/economy_observatory/)" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Join our open collaboration Economy Data Observatory team as a &lt;a href="https://reprex-next.netlify.app/authors/curator">data curator&lt;/a>, &lt;a href="https://reprex-next.netlify.app/authors/developer">developer&lt;/a> or &lt;a href="https://reprex-next.netlify.app/authors/team">business developer&lt;/a>, or share your data in our public repository &lt;a href="https://zenodo.org/communities/economy_observatory/" target="_blank" rel="noopener">Economy Data Observatory on Zenodo&lt;/a>
&lt;/figcaption>&lt;/figure>
&lt;h2 id="join-us">Join us&lt;/h2>
&lt;p>&lt;em>Join our open collaboration Economy Data Observatory team as a &lt;a href="https://reprex-next.netlify.app/authors/curator">data curator&lt;/a>, &lt;a href="https://reprex-next.netlify.app/authors/developer">developer&lt;/a> or &lt;a href="https://reprex-next.netlify.app/authors/team">business developer&lt;/a>. More interested in environmental impact analysis? Try our &lt;a href="https://greendeal.dataobservatory.eu/#contributors" target="_blank" rel="noopener">Green Deal Data Observatory&lt;/a> team! Or does your interest lie more in data governance, trustworthy AI and other digital market problems? Check out our &lt;a href="https://music.dataobservatory.eu/#contributors" target="_blank" rel="noopener">Digital Music Observatory&lt;/a> team!&lt;/em>&lt;/p></description></item><item><title>Where Are People More Likely To Treat Climate Change as the Most Serious Global Problem?</title><link>https://reprex-next.netlify.app/post/2021-03-06-individual-join/</link><pubDate>Sat, 06 Mar 2021 00:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-03-06-individual-join/</guid><description>&lt;pre>&lt;code>library(regions)
library(lubridate)
library(dplyr)
if ( dir.exists('data-raw') ) {
data_raw_dir &amp;lt;- &amp;quot;data-raw&amp;quot;
} else {
data_raw_dir &amp;lt;- file.path(&amp;quot;..&amp;quot;, &amp;quot;..&amp;quot;, &amp;quot;data-raw&amp;quot;)
}
&lt;/code>&lt;/pre>
&lt;p>The first results of our longitudinal table &lt;a href="post/2021-03-05-retroharmonize-climate/">were difficult to
map&lt;/a>, because the surveys used
an obsolete regional coding. We will correct the coding where
possible, and join the data with the European Environment Agency’s (EEA)
Air Quality e-Reporting (AQ e-Reporting) data on environmental
pollution. We recoded the annual level for every available reporting
station [&lt;em>not shown here&lt;/em>]; all values are in μg/m³. The period
under observation is 2014-2016. Data file:
&lt;a href="https://www.eea.europa.eu/data-and-maps/data/aqereporting-8" target="_blank" rel="noopener">https://www.eea.europa.eu/data-and-maps/data/aqereporting-8&lt;/a> (European
Environment Agency 2021).&lt;/p>
&lt;h2 id="recoding-the-regions">Recoding the Regions&lt;/h2>
&lt;p>Recoding means that the boundaries of a region are unchanged, but the
country changed the region’s name and code because of other boundary
changes that did not affect our observation unit. We explain the
problem and the solution in greater detail in &lt;a href="http://netzero.dataobservatory.eu/post/2021-03-06-regions-climate/" target="_blank" rel="noopener">our
tutorial&lt;/a>
that aggregates the data at the regional level.&lt;/p>
&lt;pre>&lt;code>panel &amp;lt;- readRDS(file.path(data_raw_dir, &amp;quot;climate-panel.rds&amp;quot;))
climate_data_geocode &amp;lt;- panel %&amp;gt;%
  mutate ( year = lubridate::year(date_of_interview)) %&amp;gt;%
  recode_nuts()
&lt;/code>&lt;/pre>
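&lt;p>As a rough illustration of what recoding means, the sketch below maps obsolete codes to current ones with a small correspondence table in base R. The region codes and the &lt;code>nuts_lookup&lt;/code> table are made up for this example; it only mimics the idea behind &lt;code>recode_nuts()&lt;/code>, not its implementation.&lt;/p>
&lt;pre>&lt;code># Hypothetical correspondence table: obsolete code and current code.
# These codes are invented for illustration only.
nuts_lookup &amp;lt;- data.frame(
  code_old = c(&amp;quot;XX11&amp;quot;, &amp;quot;XX12&amp;quot;),
  code_new = c(&amp;quot;XXA1&amp;quot;, &amp;quot;XXA2&amp;quot;),
  stringsAsFactors = FALSE
)
survey &amp;lt;- data.frame(
  rowid  = c(&amp;quot;s1&amp;quot;, &amp;quot;s2&amp;quot;, &amp;quot;s3&amp;quot;),
  region = c(&amp;quot;XX11&amp;quot;, &amp;quot;XX12&amp;quot;, &amp;quot;YY20&amp;quot;),
  stringsAsFactors = FALSE
)
# Look up each region; codes missing from the table are kept unchanged.
idx &amp;lt;- match(survey$region, nuts_lookup$code_old)
survey$region_recoded &amp;lt;- ifelse(is.na(idx), survey$region, nuts_lookup$code_new[idx])
survey
&lt;/code>&lt;/pre>
&lt;p>Only the labels change here: the number of rows and the regions they refer to stay the same, which is what distinguishes recoding from a boundary change.&lt;/p>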
&lt;p>Let’s load the air pollution data and join it by the corrected geocodes:&lt;/p>
&lt;pre>&lt;code>load(file.path(&amp;quot;data&amp;quot;, &amp;quot;air_pollutants.rda&amp;quot;)) ## good practice to use system-independent file.path
climate_awareness_air &amp;lt;- climate_data_geocode %&amp;gt;%
  rename ( region_nuts_codes = .data$code_2016) %&amp;gt;%
  left_join ( air_pollutants, by = &amp;quot;region_nuts_codes&amp;quot; ) %&amp;gt;%
  select ( -all_of(c(&amp;quot;w1&amp;quot;, &amp;quot;wex&amp;quot;, &amp;quot;date_of_interview&amp;quot;,
                     &amp;quot;typology&amp;quot;, &amp;quot;typology_change&amp;quot;, &amp;quot;geo&amp;quot;, &amp;quot;region&amp;quot;))) %&amp;gt;%
  mutate (
    # remove special labels and convert them to NA_real_
    age_education = retroharmonize::as_numeric(age_education)) %&amp;gt;%
  mutate_if ( is.character, as.factor) %&amp;gt;%
  mutate (
    # we only have responses from 4 years, and this should be treated as a categorical variable
    year = as.factor(year)
  ) %&amp;gt;%
  filter ( complete.cases(.) )
&lt;/code>&lt;/pre>
&lt;p>The &lt;code>climate_awareness_air&lt;/code> data frame contains the answers of 75086
individual respondents. 17.07% thought that climate change was the most
serious world problem and 33.6% mentioned climate change as one of the
three most important global problems.&lt;/p>
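&lt;p>Because both indicators are coded 0/1, these shares are simply column means multiplied by 100. A minimal sketch with a made-up four-row data frame (the survey data are not reproduced here):&lt;/p>
&lt;pre>&lt;code># Toy data: four hypothetical respondents, indicators coded 0/1.
toy &amp;lt;- data.frame(
  serious_world_problems_first          = c(1, 0, 0, 0),
  serious_world_problems_climate_change = c(0, 1, 1, 0)
)
# The mean of a 0/1 variable is the share of respondents coded 1.
share_first     &amp;lt;- 100 * mean(toy$serious_world_problems_first)
share_mentioned &amp;lt;- 100 * mean(toy$serious_world_problems_climate_change)
c(first = share_first, mentioned = share_mentioned)
&lt;/code>&lt;/pre>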
&lt;pre>&lt;code>summary ( climate_awareness_air )
## rowid serious_world_problems_first
## ZA5877_v2-0-0_1 : 1 Min. :0.0000
## ZA5877_v2-0-0_10 : 1 1st Qu.:0.0000
## ZA5877_v2-0-0_100 : 1 Median :0.0000
## ZA5877_v2-0-0_1000 : 1 Mean :0.1707
## ZA5877_v2-0-0_10000: 1 3rd Qu.:0.0000
## ZA5877_v2-0-0_10001: 1 Max. :1.0000
## (Other) :75080
## serious_world_problems_climate_change isocntry
## Min. :0.000 BE : 3028
## 1st Qu.:0.000 CZ : 3023
## Median :0.000 NL : 3019
## Mean :0.336 SK : 3000
## 3rd Qu.:1.000 SE : 2980
## Max. :1.000 DE-W : 2978
## (Other):57058
## marital_status age_education
## (Re-)Married: without children :13242 18 :15485
## (Re-)Married: children this marriage :12696 19 : 7728
## Single: without children : 7650 16 : 5840
## (Re-)Married: w children of this marriage: 6520 still studying: 5098
## (Re-)Married: living without children : 6225 17 : 5092
## Single: living without children : 4102 15 : 4528
## (Other) :24651 (Other) :31315
## age_exact occupation_of_respondent
## Min. :15.0 Retired, unable to work :22911
## 1st Qu.:36.0 Skilled manual worker : 6774
## Median :51.0 Employed position, at desk : 6716
## Mean :50.1 Employed position, service job: 5624
## 3rd Qu.:65.0 Middle management, etc. : 5252
## Max. :99.0 Student : 5098
## (Other) :22711
## occupation_of_respondent_recoded
## Employed (10-18 in d15a) :32763
## Not working (1-4 in d15a) :37125
## Self-employed (5-9 in d15a): 5198
##
##
##
##
## respondent_occupation_scale_c_14
## Retired (4 in d15a) :22911
## Manual workers (15 to 18 in d15a) :15269
## Other white collars (13 or 14 in d15a): 9203
## Managers (10 to 12 in d15a) : 8291
## Self-employed (5 to 9 in d15a) : 5198
## Students (2 in d15a) : 5098
## (Other) : 9116
## type_of_community is_student no_education
## DK : 34 Min. :0.0000 Min. :0.000000
## Large town :20939 1st Qu.:0.0000 1st Qu.:0.000000
## Rural area or village :24686 Median :0.0000 Median :0.000000
## Small or middle sized town: 9850 Mean :0.0679 Mean :0.008151
## Small/middle town :19577 3rd Qu.:0.0000 3rd Qu.:0.000000
## Max. :1.0000 Max. :1.000000
##
## education year region_nuts_codes country_code
## Min. :14.00 2013:25103 LU : 1432 DE : 4531
## 1st Qu.:17.00 2015: 0 MT : 1398 GB : 3538
## Median :18.00 2017:25053 CY : 1192 BE : 3028
## Mean :19.61 2019:24930 SK02 : 1053 CZ : 3023
## 3rd Qu.:22.00 EL30 : 974 NL : 3019
## Max. :30.00 EE : 973 SK : 3000
## (Other):68064 (Other):54947
## pm2_5 pm10 o3 BaP
## Min. : 2.109 Min. : 5.883 Min. : 66.37 Min. :0.0102
## 1st Qu.: 9.374 1st Qu.: 28.326 1st Qu.: 90.89 1st Qu.:0.1779
## Median :11.866 Median : 33.673 Median :102.81 Median :0.4105
## Mean :12.954 Mean : 38.637 Mean :101.49 Mean :0.8759
## 3rd Qu.:15.890 3rd Qu.: 49.488 3rd Qu.:110.73 3rd Qu.:1.0692
## Max. :41.293 Max. :123.239 Max. :141.04 Max. :7.8050
##
## so2 ap_pc1 ap_pc2 ap_pc3
## Min. : 0.0000 Min. :-4.6669 Min. :-2.21851 Min. :-2.1007
## 1st Qu.: 0.0000 1st Qu.:-0.4624 1st Qu.:-0.49130 1st Qu.:-0.5695
## Median : 0.0000 Median : 0.4263 Median : 0.02902 Median :-0.1113
## Mean : 0.1032 Mean : 0.1031 Mean : 0.04166 Mean :-0.1746
## 3rd Qu.: 0.0000 3rd Qu.: 0.9748 3rd Qu.: 0.57416 3rd Qu.: 0.3309
## Max. :42.5325 Max. : 2.0344 Max. : 3.25841 Max. : 4.1615
##
## ap_pc4 ap_pc5
## Min. :-1.7387 Min. :-2.75079
## 1st Qu.:-0.1669 1st Qu.:-0.18748
## Median : 0.0371 Median : 0.01811
## Mean : 0.1154 Mean : 0.06797
## 3rd Qu.: 0.3050 3rd Qu.: 0.34937
## Max. : 3.2476 Max. : 1.42816
##
&lt;/code>&lt;/pre>
&lt;p>Let’s see a simple CART tree! We remove the regional codes, because
regional climate awareness differs greatly.
These differences, together with education level and the survey year,
are the most important predictors of thinking about
climate change as the most important global problem in Europe.&lt;/p>
&lt;pre>&lt;code># Classification Tree with rpart
library(rpart)
# grow tree
fit &amp;lt;- rpart(as.factor(serious_world_problems_first) ~ .,
             method=&amp;quot;class&amp;quot;, data=climate_awareness_air %&amp;gt;%
               select ( - all_of(c(&amp;quot;rowid&amp;quot;, &amp;quot;region_nuts_codes&amp;quot;))),
             control = rpart.control(cp = 0.005))
printcp(fit) # display the results
##
## Classification tree:
## rpart(formula = as.factor(serious_world_problems_first) ~ .,
## data = climate_awareness_air %&amp;gt;% select(-all_of(c(&amp;quot;rowid&amp;quot;,
## &amp;quot;region_nuts_codes&amp;quot;))), method = &amp;quot;class&amp;quot;, control = rpart.control(cp = 0.005))
##
## Variables actually used in tree construction:
## [1] age_education isocntry
## [3] serious_world_problems_climate_change year
##
## Root node error: 12817/75086 = 0.1707
##
## n= 75086
##
## CP nsplit rel error xerror xstd
## 1 0.0240566 0 1.00000 1.00000 0.0080438
## 2 0.0082703 3 0.92783 0.92783 0.0078055
## 3 0.0050000 5 0.91129 0.91425 0.0077588
plotcp(fit) # visualize cross-validation results
&lt;/code>&lt;/pre>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Visualize cross-validation results" srcset="
/post/2021-03-06-individual-join/rpart-1_hu9f1f775a32eec3a67a573c0d2df50ef4_4271_8ce48ac0f7ba6b1d3752385b96368cc3.webp 400w,
/post/2021-03-06-individual-join/rpart-1_hu9f1f775a32eec3a67a573c0d2df50ef4_4271_b20e6dca7fcadd4576da216956498a35.webp 760w,
/post/2021-03-06-individual-join/rpart-1_hu9f1f775a32eec3a67a573c0d2df50ef4_4271_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/post/2021-03-06-individual-join/rpart-1_hu9f1f775a32eec3a67a573c0d2df50ef4_4271_8ce48ac0f7ba6b1d3752385b96368cc3.webp"
width="672"
height="480"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
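&lt;p>One common way to read the CP table is to pick the complexity parameter with the lowest cross-validated error (&lt;code>xerror&lt;/code>) and prune the tree back to it. The sketch below does this on the &lt;code>kyphosis&lt;/code> example data that ships with &lt;code>rpart&lt;/code>, not on our survey data; it is a rule of thumb rather than a step of the analysis above, and the object names are only illustrative.&lt;/p>
&lt;pre>&lt;code>library(rpart)
set.seed(2021) # the cross-validation folds are random
toy_fit &amp;lt;- rpart(Kyphosis ~ Age + Number + Start,
                 data = kyphosis, method = &amp;quot;class&amp;quot;,
                 control = rpart.control(cp = 0.005))
# cptable holds the same columns that printcp() displays
cp_table &amp;lt;- toy_fit$cptable
best_row &amp;lt;- which.min(cp_table[, &amp;quot;xerror&amp;quot;])
best_cp  &amp;lt;- cp_table[best_row, &amp;quot;CP&amp;quot;]
# prune() cuts the tree back to the chosen complexity parameter
pruned_fit &amp;lt;- prune(toy_fit, cp = best_cp)
printcp(pruned_fit)
&lt;/code>&lt;/pre>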
&lt;pre>&lt;code>summary(fit) # detailed summary of splits
## Call:
## rpart(formula = as.factor(serious_world_problems_first) ~ .,
## data = climate_awareness_air %&amp;gt;% select(-all_of(c(&amp;quot;rowid&amp;quot;,
## &amp;quot;region_nuts_codes&amp;quot;))), method = &amp;quot;class&amp;quot;, control = rpart.control(cp = 0.005))
## n= 75086
##
## CP nsplit rel error xerror xstd
## 1 0.024056592 0 1.0000000 1.0000000 0.008043837
## 2 0.008270266 3 0.9278302 0.9278302 0.007805478
## 3 0.005000000 5 0.9112897 0.9142545 0.007758824
##
## Variable importance
## serious_world_problems_climate_change isocntry
## 31 26
## country_code BaP
## 20 8
## pm2_5 ap_pc1
## 4 3
## age_education pm10
## 2 2
## education ap_pc2
## 2 1
## year
## 1
##
## Node number 1: 75086 observations, complexity param=0.02405659
## predicted class=0 expected loss=0.1706976 P(node) =1
## class counts: 62269 12817
## probabilities: 0.829 0.171
## left son=2 (25229 obs) right son=3 (49857 obs)
## Primary splits:
## serious_world_problems_climate_change &amp;lt; 0.5 to the right, improve=2214.2040, (0 missing)
## isocntry splits as RRLLLRRRLLRLRLLLLLLLLLLRRLLLRLL, improve= 728.0160, (0 missing)
## country_code splits as RRLLLRRLLRLLLLLLLLLLRRLLLRLL, improve= 673.3656, (0 missing)
## BaP &amp;lt; 0.4300347 to the right, improve= 310.6229, (0 missing)
## pm2_5 &amp;lt; 13.38264 to the right, improve= 296.4013, (0 missing)
## Surrogate splits:
## age_education splits as ----RRRRRR-RRRRRRRRRR-RRRRRRRRRR-RRRRRRRRRR-RRRRRRRRRR-RRRRRL-RRR-RRRRRRRRR--RRRLLR--R-R, agree=0.664, adj=0, (0 split)
## pm10 &amp;lt; 7.491315 to the left, agree=0.664, adj=0, (0 split)
##
## Node number 2: 25229 observations
## predicted class=0 expected loss=0 P(node) =0.3360014
## class counts: 25229 0
## probabilities: 1.000 0.000
##
## Node number 3: 49857 observations, complexity param=0.02405659
## predicted class=0 expected loss=0.2570752 P(node) =0.6639986
## class counts: 37040 12817
## probabilities: 0.743 0.257
## left son=6 (34631 obs) right son=7 (15226 obs)
## Primary splits:
## isocntry splits as RRLLLRRRLLRLRLLLLLLLLLLRRLLLRLL, improve=1454.9460, (0 missing)
## country_code splits as RRLLLRRLLRLLLLLLLLLLRRLLLRLL, improve=1359.7210, (0 missing)
## BaP &amp;lt; 0.4300347 to the right, improve= 629.8844, (0 missing)
## pm2_5 &amp;lt; 13.38264 to the right, improve= 555.7484, (0 missing)
## ap_pc1 &amp;lt; -0.005459537 to the left, improve= 533.3579, (0 missing)
## Surrogate splits:
## country_code splits as RRLLLRRLLRLLLLLLLLLLRRLLLRLL, agree=0.987, adj=0.957, (0 split)
## BaP &amp;lt; 0.1749425 to the right, agree=0.775, adj=0.264, (0 split)
## pm2_5 &amp;lt; 5.206993 to the right, agree=0.737, adj=0.140, (0 split)
## ap_pc1 &amp;lt; 1.405527 to the left, agree=0.733, adj=0.126, (0 split)
## pm10 &amp;lt; 25.31211 to the right, agree=0.718, adj=0.076, (0 split)
##
## Node number 6: 34631 observations
## predicted class=0 expected loss=0.1769802 P(node) =0.4612178
## class counts: 28502 6129
## probabilities: 0.823 0.177
##
## Node number 7: 15226 observations, complexity param=0.02405659
## predicted class=0 expected loss=0.4392487 P(node) =0.2027808
## class counts: 8538 6688
## probabilities: 0.561 0.439
## left son=14 (11607 obs) right son=15 (3619 obs)
## Primary splits:
## isocntry splits as LL---LLR--L-L----------LL---R--, improve=337.5462, (0 missing)
## country_code splits as LL---LR--L-L--------LL---R--, improve=337.5462, (0 missing)
## age_education splits as ----LLLLLL-LLLRRRRRRR-RRRRRRRRRL-RRRRRRLLRR-RRRRLLRLRL-RRLRRR-RRR-LLLLRRR-----LR-----L-R, improve=294.0807, (0 missing)
## education &amp;lt; 22.5 to the left, improve=262.3747, (0 missing)
## BaP &amp;lt; 0.053328 to the right, improve=232.7043, (0 missing)
## Surrogate splits:
## BaP &amp;lt; 0.053328 to the right, agree=0.878, adj=0.485, (0 split)
## pm2_5 &amp;lt; 4.810361 to the right, agree=0.827, adj=0.271, (0 split)
## ap_pc2 &amp;lt; 0.8746175 to the left, agree=0.792, adj=0.124, (0 split)
## so2 &amp;lt; 0.3302972 to the left, agree=0.781, adj=0.078, (0 split)
## age_education splits as ----LLLLLL-LLLLLLLRLR-LRRLRRRRRR-RRRRLLLLLR-LRLRLLRRLL-LLRLLR-LLR-RRLLLLL-----RR-----R-L, agree=0.779, adj=0.071, (0 split)
##
## Node number 14: 11607 observations, complexity param=0.008270266
## predicted class=0 expected loss=0.3804601 P(node) =0.1545827
## class counts: 7191 4416
## probabilities: 0.620 0.380
## left son=28 (7462 obs) right son=29 (4145 obs)
## Primary splits:
## age_education splits as ----LLLLLL-LRRRRRRRRR-RRLRRLRRLL-RRRRLRLLRR-RLRLLLRLRL-RR-RR--RRL-L-LLRRR------------L-R, improve=123.71070, (0 missing)
## year splits as R-LR, improve=107.79460, (0 missing)
## education &amp;lt; 20.5 to the left, improve= 90.28724, (0 missing)
## occupation_of_respondent splits as LRRLRRRRRLRLLLRLLL, improve= 84.62865, (0 missing)
## respondent_occupation_scale_c_14 splits as LRLLLRRL, improve= 68.88653, (0 missing)
## Surrogate splits:
## education &amp;lt; 20.5 to the left, agree=0.950, adj=0.861, (0 split)
## occupation_of_respondent splits as LLLLRLLRRLRLLLRLLL, agree=0.738, adj=0.267, (0 split)
## respondent_occupation_scale_c_14 splits as LRLLLLRL, agree=0.733, adj=0.251, (0 split)
## is_student &amp;lt; 0.5 to the left, agree=0.709, adj=0.186, (0 split)
## age_exact &amp;lt; 23.5 to the right, agree=0.676, adj=0.094, (0 split)
##
## Node number 15: 3619 observations
## predicted class=1 expected loss=0.3722023 P(node) =0.04819807
## class counts: 1347 2272
## probabilities: 0.372 0.628
##
## Node number 28: 7462 observations
## predicted class=0 expected loss=0.326052 P(node) =0.09937938
## class counts: 5029 2433
## probabilities: 0.674 0.326
##
## Node number 29: 4145 observations, complexity param=0.008270266
## predicted class=0 expected loss=0.4784077 P(node) =0.05520337
## class counts: 2162 1983
## probabilities: 0.522 0.478
## left son=58 (2573 obs) right son=59 (1572 obs)
## Primary splits:
## year splits as L-LR, improve=40.13885, (0 missing)
## occupation_of_respondent splits as LRLLRRRRRLRLLLRLLL, improve=18.33254, (0 missing)
## marital_status splits as LRRRLRRRLRRLRLLRRRRRRLRLRLLRR, improve=17.86888, (0 missing)
## type_of_community splits as LRLRL, improve=17.55254, (0 missing)
## age_education splits as ------------LLRRRRRRR-RR-RL-RR---LRRR-R--LR-R-R---R-R--RR-RR--RR------RRR--------------R, improve=14.66121, (0 missing)
## Surrogate splits:
## type_of_community splits as LLLRL, agree=0.777, adj=0.412, (0 split)
## marital_status splits as RRLLLLLRLLLLLLLRRRLLLLLLRLRLL, agree=0.680, adj=0.155, (0 split)
## isocntry splits as LL---LL---L-R----------LL------, agree=0.669, adj=0.127, (0 split)
## country_code splits as LL---L---L-R--------LL------, agree=0.669, adj=0.127, (0 split)
## o3 &amp;lt; 83.06345 to the right, agree=0.650, adj=0.076, (0 split)
##
## Node number 58: 2573 observations
## predicted class=0 expected loss=0.4240187 P(node) =0.03426737
## class counts: 1482 1091
## probabilities: 0.576 0.424
##
## Node number 59: 1572 observations
## predicted class=1 expected loss=0.43257 P(node) =0.02093599
## class counts: 680 892
## probabilities: 0.433 0.567
# plot tree
plot(fit, uniform=TRUE,
main=&amp;quot;Classification Tree: Climate Change Is The Most Serious Threat&amp;quot;)
text(fit, use.n=TRUE, all=TRUE, cex=.8)
## Warning in labels.rpart(x, minlength = minlength): more than 52 levels in a
## predicting factor, truncated for printout
&lt;/code>&lt;/pre>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Classification Tree: Climate Change Is The Most Serious Threat" srcset="
/post/2021-03-06-individual-join/rpart-2_hu8765078af843fd2a25e4b77d7cba4bfb_9882_0bdd94d7f6c1efcc2575c1adeb6917c8.webp 400w,
/post/2021-03-06-individual-join/rpart-2_hu8765078af843fd2a25e4b77d7cba4bfb_9882_daf3b553e16b54a4b23a242bc9ef1e6b.webp 760w,
/post/2021-03-06-individual-join/rpart-2_hu8765078af843fd2a25e4b77d7cba4bfb_9882_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://reprex-next.netlify.app/post/2021-03-06-individual-join/rpart-2_hu8765078af843fd2a25e4b77d7cba4bfb_9882_0bdd94d7f6c1efcc2575c1adeb6917c8.webp"
width="672"
height="480"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;pre>&lt;code>saveRDS ( climate_awareness_air , file.path(tempdir(), &amp;quot;climate_panel_recoded.rds&amp;quot;), version = 2)
# not evaluated
saveRDS( climate_awareness_air, file = file.path(&amp;quot;data-raw&amp;quot;, &amp;quot;climate-panel_recoded.rds&amp;quot;))
&lt;/code>&lt;/pre></description></item><item><title>What is Retrospective Survey Harmonization?</title><link>https://reprex-next.netlify.app/post/2021-03-04_retroharmonize_intro/</link><pubDate>Thu, 04 Mar 2021 00:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-03-04_retroharmonize_intro/</guid><description>&lt;h2 id="reproducible-ex-post-harmonization-of-survey-microdata">Reproducible ex post harmonization of survey microdata&lt;/h2>
&lt;p>Retrospective survey harmonization allows the comparison of opinion polls
conducted in different countries or at different times. In this example we are
working with data from surveys that were ex ante harmonized to a certain
degree – in our tutorials we are choosing questions that were asked in
the same way in many natural languages. For example, you can compare
what percentage of people in various European countries, provinces
and regions thought climate change was a serious world problem back in
2013, 2015, 2017 and 2019.&lt;/p>
&lt;p>We developed the
&lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">retroharmonize&lt;/a> R package
to help this process. We have tested the package extensively with about 80
Eurobarometer and 5 Afrobarometer survey files, and to a lesser extent with
Arab Barometer files. This allows the comparison of various survey
answers in about 70 countries. These policy-oriented survey programs were
designed to be harmonized to a certain degree, but their ex post
harmonization is still necessary, challenging and error-prone.
Retrospective harmonization includes harmonizing the different
coding used for questions and answer options, the post-stratification
weights, and the different file formats.&lt;/p>
&lt;p>&lt;a href="https://ec.europa.eu/commfrontoffice/publicopinion/index.cfm" target="_blank" rel="noopener">Eurobarometer&lt;/a>,
&lt;a href="https://www.afrobarometer.org/" target="_blank" rel="noopener">Afrobarometer&lt;/a>, &lt;a href="https://www.arabbarometer.org/" target="_blank" rel="noopener">Arab
Barometer&lt;/a> and
&lt;a href="https://www.latinobarometro.org/lat.jsp" target="_blank" rel="noopener">Latinobarómetro&lt;/a> make survey
files that are harmonized across countries available for research under
various terms. Our
&lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">retroharmonize&lt;/a> package is not
affiliated with them, and to run our examples, you must visit their
websites, carefully read their terms, agree to them, and download their
data yourself. The value we add is that we help to connect their
files across time (from different years) or across these programs.&lt;/p>
&lt;p>The survey programs mentioned above publish their data in the
proprietary SPSS format. This file format can be imported and translated
to R objects with the haven package; however, we needed to re-design
&lt;a href="https://haven.tidyverse.org/" target="_blank" rel="noopener">haven’s&lt;/a>
&lt;a href="https://haven.tidyverse.org/reference/labelled_spss.html" target="_blank" rel="noopener">labelled_spss&lt;/a>
class to maintain far more metadata. That class is, in turn, a modification of
the &lt;a href="">labelled&lt;/a> class. The haven package was designed and tested with
data stored in individual SPSS files.&lt;/p>
&lt;p>Joseph Larmarange, the author of labelled, describes two main approaches
to working with labelled data, such as SPSS’s method of storing categorical
data, in the &lt;a href="http://larmarange.github.io/labelled/articles/intro_labelled.html" target="_blank" rel="noopener">Introduction to
labelled&lt;/a>.&lt;/p>
&lt;figure id="figure-two-main-approaches-of-labelled-data-conversion">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="img/larmarange_approaches_to_labelled.png" alt="Two main approaches of labelled data conversion." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Two main approaches of labelled data conversion.
&lt;/figcaption>&lt;/figure>
&lt;p>Our approach is a further extension of &lt;strong>Approach B&lt;/strong>. Survey
harmonization in our case always means joining data from several
SPSS files, which requires consistent coding across the data
sources. This means that data cleaning and recoding must take place
before conversion to factors, character or numeric vectors. This is
particularly important with factor data (and their simple character
conversions) and numeric data that occasionally contains labels, for
example, to describe the reason why certain data is missing. Our
tutorial vignette
&lt;a href="https://retroharmonize.dataobservatory.eu/articles/labelled_spss_survey.html" target="_blank" rel="noopener">labelled_spss_survey&lt;/a>
gives you more information about this.&lt;/p>
&lt;p>In the next series of tutorials, we will deal with an array of problems.
These are not for the faint of heart – you need to have a solid
intermediate level of R to follow.&lt;/p>
&lt;h2 id="tidy-joined-survey-data">Tidy, joined survey data&lt;/h2>
&lt;ul>
&lt;li>The identifiers in the original files may not be unique, so we have to create
new, truly unique identifiers. Weighting may not be straightforward.&lt;/li>
&lt;li>Neither the number of observations nor the number of variables (which
represent the survey questions and their translation to coded data)
is the same. Certain data may be present in only one survey and not
the other. This means that you will likely run loops on lists and
not data.frames, but eventually you must carefully join them.&lt;/li>
&lt;/ul>
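&lt;p>The two points above can be sketched in base R, assuming two tiny made-up survey waves held in a list. The survey identifiers &lt;code>ZA0001&lt;/code> and &lt;code>ZA0002&lt;/code> and all variable names are hypothetical; this is not the &lt;code>retroharmonize&lt;/code> workflow, only the underlying idea.&lt;/p>
&lt;pre>&lt;code># Two hypothetical waves: ids repeat across files, and 'trust' exists only in the first.
wave_1 &amp;lt;- data.frame(id = 1:2, age = c(34, 51), trust = c(1, 0))
wave_2 &amp;lt;- data.frame(id = 1:2, age = c(27, 68))
waves  &amp;lt;- list(ZA0001 = wave_1, ZA0002 = wave_2)
all_vars &amp;lt;- unique(unlist(lapply(waves, names)))
harmonized &amp;lt;- lapply(names(waves), function(survey_id) {
  dat &amp;lt;- waves[[survey_id]]
  # a truly unique identifier: survey id + original row id
  dat$rowid &amp;lt;- paste(survey_id, dat$id, sep = &amp;quot;_&amp;quot;)
  # add variables that this wave lacks as missing values
  for (v in setdiff(all_vars, names(dat))) dat[[v]] &amp;lt;- NA_real_
  dat[c(&amp;quot;rowid&amp;quot;, setdiff(all_vars, &amp;quot;id&amp;quot;))]
})
# join the list elements only after they share the same columns
panel &amp;lt;- do.call(rbind, harmonized)
panel
&lt;/code>&lt;/pre>
&lt;p>Working on the list first and binding at the end keeps each wave’s quirks visible until you have decided how to resolve them.&lt;/p>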
&lt;h2 id="class-conversion">Class conversion&lt;/h2>
&lt;ul>
&lt;li>Similar questions may be imported from a non-native R format, in our
case from SPSS files, in an inconsistent manner. SPSS’s variable
formats cannot be translated unambiguously to R classes.
&lt;code>retroharmonize&lt;/code> introduced a new S3 class system that handles this
problem, but eventually you will have to choose if you want to see a
numeric or character coding of each categorical variable.&lt;/li>
&lt;li>The harmonized surveys, with harmonized variable names and
harmonized value labels, must be brought to consistent R
representations (most statistical functions will only work on
numeric, factor or character data) and carefully joined into a
single data table for analysis.&lt;/li>
&lt;/ul>
&lt;h2 id="harmonization-of-variables-and-variable-labels">Harmonization of variables and variable labels&lt;/h2>
&lt;ul>
&lt;li>The same variables may come with dissimilar variable names and variable
labels. It may be a challenge to match age with age. We need to
harmonize the names of variables.&lt;/li>
&lt;li>The harmonized variables may have different labeling. One survey may
label refused answers &lt;code>declined&lt;/code> and another &lt;code>refusal&lt;/code>. On a simple
choice, climate change may be &lt;code>Climate change&lt;/code> or
&lt;code>Problem: Climate change&lt;/code>. Binary choices may have survey-specific
coding conventions. Value labels must be harmonized. There are good
tools to do this in a single file, but we have to work with several
of them.&lt;/li>
&lt;/ul>
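&lt;p>A minimal base R sketch of value label harmonization, assuming the labels quoted above arrive from different files. The target labels &lt;code>refused&lt;/code> and &lt;code>climate_change&lt;/code> are our own choice for this illustration, and the lookup vector stands in for the package’s harmonization helpers rather than reproducing them.&lt;/p>
&lt;pre>&lt;code># Labels as they might appear in different survey files
raw_labels &amp;lt;- c(&amp;quot;declined&amp;quot;, &amp;quot;refusal&amp;quot;, &amp;quot;Climate change&amp;quot;, &amp;quot;Problem: Climate change&amp;quot;)
# Named lookup vector: original label on the left, harmonized label on the right
label_map &amp;lt;- c(
  &amp;quot;declined&amp;quot;                = &amp;quot;refused&amp;quot;,
  &amp;quot;refusal&amp;quot;                 = &amp;quot;refused&amp;quot;,
  &amp;quot;Climate change&amp;quot;          = &amp;quot;climate_change&amp;quot;,
  &amp;quot;Problem: Climate change&amp;quot; = &amp;quot;climate_change&amp;quot;
)
# Indexing by name applies the same mapping to every file
harmonized_labels &amp;lt;- unname(label_map[raw_labels])
harmonized_labels
&lt;/code>&lt;/pre>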
&lt;h2 id="missing-value-harmonization">Missing value harmonization&lt;/h2>
&lt;ul>
&lt;li>There are likely to be various types of &lt;code>missing values&lt;/code>. Working
with missing values is probably where most human judgment is needed.
Why are some answers missing: was the question not asked in some
questionnaires? Is there a coding error? Did the respondent refuse
the question, or say that she did not have an answer?
&lt;code>retroharmonize&lt;/code> has a special labeled vector type that retains this
information from the raw data, if it is present, but you must make
the judgment yourself – in R, eventually you will either create a
missing category, or use &lt;code>NA_character_&lt;/code> or &lt;code>NA_real_&lt;/code>.&lt;/li>
&lt;/ul>
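&lt;p>To make that judgment concrete, here is a base R sketch that keeps the reason for missingness in a separate vector before turning the special codes into &lt;code>NA_real_&lt;/code>. The codes 98 and 99 and their meanings are hypothetical; real surveys document their own codes.&lt;/p>
&lt;pre>&lt;code># Raw numeric answers with two hypothetical special codes
age_education_raw &amp;lt;- c(18, 22, 98, 16, 99)
special_codes &amp;lt;- c(&amp;quot;still studying&amp;quot; = 98, &amp;quot;refused&amp;quot; = 99)
# Keep the reason for missingness before it is lost
missing_reason &amp;lt;- names(special_codes)[match(age_education_raw, special_codes)]
# Numeric version for analysis: special codes become NA_real_
age_education_num &amp;lt;- ifelse(age_education_raw %in% special_codes,
                            NA_real_, age_education_raw)
data.frame(age_education_raw, missing_reason, age_education_num)
&lt;/code>&lt;/pre>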
&lt;p>That’s a lot to put on your plate.&lt;/p>
&lt;p>It is unlikely that you will be able to work with completely unfamiliar
survey programs if you do not have a strong intermediate level of R. Our
package comes with tutorials for
&lt;a href="https://retroharmonize.dataobservatory.eu/articles/eurobarometer.html" target="_blank" rel="noopener">Eurobarometer&lt;/a>,
&lt;a href="https://retroharmonize.dataobservatory.eu/articles/afrobarometer.html" target="_blank" rel="noopener">Afrobarometer&lt;/a>
and our development version already covers Arab Barometer, highlighting
some peculiar issues with these survey programs; we hope these give a
head start for less experienced R users.&lt;/p></description></item><item><title>Eurobarometer Surveys Used In Our Project</title><link>https://reprex-next.netlify.app/post/2021-03-04-eurobarometer_data/</link><pubDate>Wed, 03 Mar 2021 00:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-03-04-eurobarometer_data/</guid><description>&lt;p>In our &lt;a href="https://reprex-next.netlify.app/post/2021-03-04_retroharmonize_intro/">tutorial
series&lt;/a>,
we are going to harmonize the following questionnaire items from five
Eurobarometer harmonized survey files. The Eurobarometer survey files
are harmonized across countries, but they are only partially harmonized
in time.&lt;/p>
&lt;p>All data must be downloaded from the
&lt;a href="https://www.gesis.org/en/home" target="_blank" rel="noopener">GESIS&lt;/a> Data Archive in Cologne. We are
not affiliated with GESIS and you must read and accept their terms to
use the data.&lt;/p>
&lt;h2 id="eurobarometer-802-2013">Eurobarometer 80.2 (2013)&lt;/h2>
&lt;p>GESIS Data Archive, Cologne. ZA5877 Data file Version 2.0.0,
&lt;a href="https://doi.org/10.4232/1.12792" target="_blank" rel="noopener">https://doi.org/10.4232/1.12792&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Data file: &lt;a href="https://search.gesis.org/research_data/ZA5877" target="_blank" rel="noopener">ZA5877&lt;/a>
data file (European Commission 2017).&lt;/li>
&lt;li>Questionnaire: &lt;a href="https://dbk.gesis.org/dbksearch/download.asp?id=54036" target="_blank" rel="noopener">Eurobarometer 80.2 Basic Bilingual
Questionnaire&lt;/a>&lt;/li>
&lt;li>Citation: &lt;a href="https://search.gesis.org/ajax/bibtex.php?type=research_data&amp;amp;docid=ZA5877&amp;amp;lang=en" target="_blank" rel="noopener">ZA5877
Bibtex&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>&lt;code>QA1a Which of the following do you consider to be the single most serious problem facing the world as a whole?&lt;/code>
(single choice)&lt;/p>
&lt;p>&lt;code>QA1b Which others do you consider to be serious problems?&lt;/code> (multiple
choice)&lt;/p>
&lt;p>&lt;code>QA2 And how serious a problem do you think climate change is at this moment? Please use a scale from 1 to 10, with '1' meaning it is &amp;quot;not at all a serious problem&lt;/code>
(scale 1-10)&lt;/p>
&lt;p>&lt;code>QA4 To what extent do you agree or disagree with each of the following statements? - Fighting climate change and using energy more efficiently can boost the economy and jobs in the EU&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>QA4 To what extent do you agree or disagree with each of the following statements? - Reducing fossil fuel imports from outside the EU could benefit the EU economically&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>QA5 Have you personally taken any action to fight climate change over the past six months?&lt;/code>
(binary)&lt;/p>
&lt;h2 id="eurobarometer-834-2015">Eurobarometer 83.4 (2015)&lt;/h2>
&lt;p>European Commission, Brussels; Directorate General Communication
COMM.A.1 ‘Strategy, Corporate Communication Actions and
Eurobarometer’ GESIS Data Archive, Cologne. ZA6595 Data file Version
3.0.0, &lt;a href="https://doi.org/10.4232/1.13146" target="_blank" rel="noopener">https://doi.org/10.4232/1.13146&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Data file: &lt;a href="https://search.gesis.org/research_data/ZA6595" target="_blank" rel="noopener">ZA6595&lt;/a>
data file (European Commission 2018).&lt;/li>
&lt;li>Questionnaire: &lt;a href="https://dbk.gesis.org/dbksearch/download.asp?id=57940" target="_blank" rel="noopener">Eurobarometer 83.4 Basic Bilingual
Questionnaire&lt;/a>&lt;/li>
&lt;li>Citation: &lt;a href="https://search.gesis.org/ajax/bibtex.php?type=research_data&amp;amp;docid=ZA6595&amp;amp;lang=en" target="_blank" rel="noopener">ZA6595
Bibtex&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="eurobarometer-871-2017">Eurobarometer 87.1 (2017)&lt;/h2>
&lt;p>European Commission, Brussels; Directorate General Communication,
COMM.A.1 ‘Strategic Communication’; European Parliament,
Directorate-General for Communication, Public Opinion Monitoring Unit
GESIS Data Archive, Cologne. ZA6861 Data file Version 1.2.0,
&lt;a href="https://doi.org/10.4232/1.12922" target="_blank" rel="noopener">https://doi.org/10.4232/1.12922&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Data file: &lt;a href="https://search.gesis.org/research_data/ZA6861" target="_blank" rel="noopener">ZA6861&lt;/a>
data file.&lt;/li>
&lt;li>Questionnaire: &lt;a href="https://dbk.gesis.org/dbksearch/download.asp?id=65967" target="_blank" rel="noopener">Eurobarometer 90.2 Basic Bilingual
Questionnaire&lt;/a>&lt;/li>
&lt;li>Citation: &lt;a href="https://search.gesis.org/ajax/bibtex.php?type=research_data&amp;amp;docid=ZA6861&amp;amp;lang=en" target="_blank" rel="noopener">ZA6861
Bibtex&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>&lt;code>QC1a Which of the following do you consider to be the single most serious problem facing the world as a whole?&lt;/code>
(single choice)&lt;/p>
&lt;p>&lt;code>QC1b Which others do you consider to be serious problems?&lt;/code> (multiple
choice)&lt;/p>
&lt;p>&lt;code>QC2 And how serious a problem do you think climate change is at this moment? Please use a scale from 1 to 10, with '1' meaning it is &amp;quot;not at all a serious problem&lt;/code>
(scale 1-10)&lt;/p>
&lt;p>&lt;code>QC4 To what extent do you agree or disagree with each of the following statements? - Fighting climate change and using energy more efficiently can boost the economy and jobs in the EU&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>QC4 To what extent do you agree or disagree with each of the following statements? - Promoting EU expertise in new clean technologies to countries outside the EU can benefit the EU economically&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>QC4 To what extent do you agree or disagree with each of the following statements? - Reducing fossil fuel imports from outside the EU can benefit the EU economically&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>QC4 To what extent do you agree or disagree with each of the following statements? - Reducing fossil fuel imports from outside the EU can increase the security of EU energy supplies&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>Qc4 To what extent do you agree or disagree with each of the following statements? - More public financial support should be given to the transition to clean energies even if it means subsidies to fossil fuels should be reduced.&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>Qc5 Have you personally taken any action to fight climate change over the past six months?&lt;/code>
(binary)&lt;/p>
&lt;h2 id="eurobarometer-902-2018">Eurobarometer 90.2 (2018)&lt;/h2>
&lt;p>European Commission, Brussels; Directorate General Communication,
COMM.A.3 ‘Media Monitoring and Eurobarometer’. GESIS Data Archive,
Cologne. ZA7488 Data file Version 1.0.0,
&lt;a href="https://doi.org/10.4232/1.13289" target="_blank" rel="noopener">https://doi.org/10.4232/1.13289&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Data file:
&lt;a href="https://dbk.gesis.org/dbksearch/sdesc2.asp?db=e&amp;amp;no=7488" target="_blank" rel="noopener">ZA7488&lt;/a>
(European Commission 2019a).&lt;/li>
&lt;li>Questionnaire: &lt;a href="https://dbk.gesis.org/dbksearch/download.asp?id=65967" target="_blank" rel="noopener">Eurobarometer 90.2 Basic Bilingual
Questionnaire&lt;/a>&lt;/li>
&lt;li>Citation: &lt;a href="https://search.gesis.org/ajax/bibtex.php?type=research_data&amp;amp;docid=ZA7488&amp;amp;lang=en" target="_blank" rel="noopener">ZA7488
Bibtex&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>&lt;code>QB5 To what extent do you agree or disagree with each of the following statements? - Fighting climate change and using energy more efficiently can boost the economy and jobs in the EU&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>QB5 To what extent do you agree or disagree with each of the following statements? - Promoting EU expertise in new clean technologies to countries outside the EU can benefit the EU economically&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>QB5 To what extent do you agree or disagree with each of the following statements? - Reducing fossil fuel imports from outside the EU can benefit the EU economically&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>QB5 To what extent do you agree or disagree with each of the following statements? - Reducing fossil fuel imports from outside the EU can increase the security of EU energy supplies&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>QB5 To what extent do you agree or disagree with each of the following statements? - More public financial support should be given to the transition to clean energies even if it means subsidies to fossil fuels should be reduced.&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;h2 id="eurobarometer-913-2019">Eurobarometer 91.3 (2019)&lt;/h2>
&lt;p>European Commission, Brussels; Directorate General Communication,
COMM.A.3 ‘Media Monitoring and Eurobarometer’. GESIS Data Archive,
Cologne. ZA7572 Data file Version 1.0.0,
&lt;a href="https://doi.org/10.4232/1.13372" target="_blank" rel="noopener">https://doi.org/10.4232/1.13372&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Data file:
&lt;a href="https://dbk.gesis.org/dbksearch/sdesc2.asp?db=e&amp;amp;no=7572" target="_blank" rel="noopener">ZA7572&lt;/a>
(European Commission 2019b).&lt;/li>
&lt;li>Questionnaire: &lt;a href="https://dbk.gesis.org/dbksearch/download.asp?id=66774" target="_blank" rel="noopener">Eurobarometer 91.3 Basic Bilingual
Questionnaire&lt;/a>&lt;/li>
&lt;li>Citation: &lt;a href="https://search.gesis.org/ajax/bibtex.php?type=research_data&amp;amp;docid=ZA7572&amp;amp;lang=en" target="_blank" rel="noopener">ZA7572
Bibtex&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>&lt;code>QB4 To what extent do you agree or disagree with each of the following statements? - Taking action on climate change will lead to innovation that will make EU companies more competitive (N)&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>QB4 To what extent do you agree or disagree with each of the following statements? - Promoting EU expertise in new clean technologies to countries outside the EU can benefit the EU economically&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>QB4 To what extent do you agree or disagree with each of the following statements? - Reducing fossil fuel imports from outside the EU can benefit the EU economically&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>QB4 To what extent do you agree or disagree with each of the following statements? - Adapting to the adverse impacts of climate change can have positive outcomes for citizens in the EU&lt;/code>
(agreement-disagreement 4-scale)&lt;/p>
&lt;p>&lt;code>QB5 Have you personally taken any action to fight climate change over the past six months?&lt;/code>
(binary)&lt;/p>
&lt;h2 id="references">References&lt;/h2>
&lt;p>European Commission, Brussels. 2017. “Eurobarometer 80.2 (2013).” GESIS
Data Archive, Cologne. ZA5877 Data file Version 2.0.0,
&lt;a href="https://doi.org/10.4232/1.12792" target="_blank" rel="noopener">https://doi.org/10.4232/1.12792&lt;/a>. &lt;a href="https://doi.org/10.4232/1.12792" target="_blank" rel="noopener">https://doi.org/10.4232/1.12792&lt;/a>.&lt;/p>
&lt;p>———. 2018. “Eurobarometer 83.4 (2015).” GESIS Data Archive, Cologne.
ZA6595 Data file Version 3.0.0, &lt;a href="https://doi.org/10.4232/1.13146" target="_blank" rel="noopener">https://doi.org/10.4232/1.13146&lt;/a>.&lt;/p>
&lt;p>———. 2019a. “Eurobarometer 90.2 (2018).” GESIS Data Archive, Cologne.
ZA7488 Data file Version 1.0.0, &lt;a href="https://doi.org/10.4232/1.13289" target="_blank" rel="noopener">https://doi.org/10.4232/1.13289&lt;/a>.&lt;/p>
&lt;p>———. 2019b. “Eurobarometer 91.3 (2019).” GESIS Data Archive, Cologne.
ZA7572 Data file Version 1.0.0, &lt;a href="https://doi.org/10.4232/1.13372" target="_blank" rel="noopener">https://doi.org/10.4232/1.13372&lt;/a>.&lt;/p></description></item></channel></rss>