<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>datathon | Reprex</title><link>https://reprex-next.netlify.app/tag/datathon/</link><atom:link href="https://reprex-next.netlify.app/tag/datathon/index.xml" rel="self" type="application/rss+xml"/><description>datathon</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Thu, 03 Jun 2021 16:00:00 +0000</lastBuildDate><image><url>https://reprex-next.netlify.app/media/icon_hub9491570ac57158c0eeecc95c95b13e5_20247_512x512_fill_lanczos_center_3.png</url><title>datathon</title><link>https://reprex-next.netlify.app/tag/datathon/</link></image><item><title>Economic and Environment Impact Analysis, Automated for Data-as-Service</title><link>https://reprex-next.netlify.app/post/2021-06-03-iotables-release/</link><pubDate>Thu, 03 Jun 2021 16:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-06-03-iotables-release/</guid><description>&lt;p>We have released a new version of
&lt;a href="https://iotables.dataobservatory.eu/" target="_blank" rel="noopener">iotables&lt;/a> as part of the
&lt;a href="http://ropengov.org/" target="_blank" rel="noopener">rOpenGov&lt;/a> project. The package, as the name
suggests, works with European symmetric input-output tables (SIOTs).
SIOTs are among the most complex governmental statistical products. They
show how each country’s 64 agricultural, industrial, service, and
sometimes household sectors relate to each other. They are estimated
from various components of GDP and tax collection data at least every
five years.&lt;/p>
&lt;p>SIOTs offer great value to policy-makers and analysts to make more than
educated guesses on how a million euros, pounds or Czech korunas spent
on a certain sector will impact other sectors of the economy, employment
or GDP. What happens when a bank starts to give new loans and advertise
them? How is an increase in economic activity going to affect the amount
of wages paid, and where will consumers most likely spend their
wages? As national economies begin to reopen after the COVID-19 pandemic
lockdowns, SIOTs can be used to calculate direct and indirect
employment effects or value added effects of government grant programs
to sectors such as cultural and creative industries or actors such as
venues for performing arts, movie theaters, bars and restaurants.&lt;/p>
&lt;p>Making such calculations requires a bit of matrix algebra and an
understanding of input-output economics, direct and indirect effects, and
multipliers. Economists, grant designers, and policy makers have those
skills, but until now such calculations were made either in cumbersome
Excel sheets or in proprietary software, because the key to these
calculations is to keep vectors and matrices, which have at least one
dimension of 64, perfectly aligned. We made this process reproducible with
&lt;a href="https://iotables.dataobservatory.eu/" target="_blank" rel="noopener">iotables&lt;/a> and
&lt;a href="https://CRAN.R-project.org/package=eurostat" target="_blank" rel="noopener">eurostat&lt;/a> on
&lt;a href="http://ropengov.org/" target="_blank" rel="noopener">rOpenGov&lt;/a>.&lt;/p>
&lt;figure id="figure-our-iotables-package-creates-direct-indirect-effects-and-multipliers-programatically-our-observatory-will-make-those-indicators-available-for-all-european-countries">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/media/img/package_screenshots/iotables_0_4_5.png" alt="Our iotables package calculates direct and indirect effects and multipliers programmatically. Our observatory will make those indicators available for all European countries." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Our iotables package calculates direct and indirect effects and multipliers programmatically. Our observatory will make those indicators available for all European countries.
&lt;/figcaption>&lt;/figure>
&lt;h2 id="accessing-and-tidying-the-data-programmatically">Accessing and tidying the data programmatically&lt;/h2>
&lt;p>The iotables package is in a way an extension to the &lt;em>eurostat&lt;/em> R
package, which provides programmatic access to the
&lt;a href="https://ec.europa.eu/eurostat" target="_blank" rel="noopener">Eurostat&lt;/a> data warehouse. The reason for
releasing a new package is that working with SIOTs requires plenty of
meticulous data wrangling based on various &lt;em>metadata&lt;/em> sources, apart
from actually accessing the &lt;em>data&lt;/em> itself. When working with matrix
equations, the bar is higher than with tidy data. Not only must your rows
and columns match, but their ordering must strictly conform to the
quadrants of the matrix system, including the connecting trade or tax
matrices.&lt;/p>
&lt;p>When you download a country’s SIOT, you receive a long-form data
frame, a very, very long one, which contains the matrix values and their
labels like this:&lt;/p>
&lt;pre>&lt;code>## Table naio_10_cp1700 cached at C:\Users\...\Temp\RtmpGQF4gr/eurostat/naio_10_cp1700_date_code_FF.rds
# We save it here for further reference
saveRDS(naio_10_cp1700, &amp;quot;not_included/naio_10_cp1700_date_code_FF.rds&amp;quot;)
# Should you need to retrieve the large temporary files, they are listed with
dir(file.path(tempdir(), &amp;quot;eurostat&amp;quot;))
dplyr::slice_head(naio_10_cp1700, n = 5)
## # A tibble: 5 x 7
## unit stk_flow induse prod_na geo time values
## &amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt; &amp;lt;date&amp;gt; &amp;lt;dbl&amp;gt;
## 1 MIO_EUR DOM CPA_A01 B1G EA19 2019-01-01 141873.
## 2 MIO_EUR DOM CPA_A01 B1G EU27_2020 2019-01-01 174976.
## 3 MIO_EUR DOM CPA_A01 B1G EU28 2019-01-01 187814.
## 4 MIO_EUR DOM CPA_A01 B2A3G EA19 2019-01-01 0
## 5 MIO_EUR DOM CPA_A01 B2A3G EU27_2020 2019-01-01 0
&lt;/code>&lt;/pre>
&lt;p>The metadata reads like this: the units are in millions of euros, and we
are analyzing domestic flows and the national account items &lt;code>B1-B2&lt;/code> for the
industry &lt;code>A01&lt;/code>. The information of a 64x64 matrix (the SIOT) and its
connecting matrices, such as taxes, employment, or CO&lt;sub>2&lt;/sub>
emissions, must be placed in exactly one correct ordering of columns and
rows. A single data wrangling error will usually lead to an error
(the matrix equation has no solution) or, what is worse, to an algebraic
error that is very difficult to trace. Our package not only labels this
data meaningfully, but also creates tidy data frames that contain each
necessary matrix or vector with a key column.&lt;/p>
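&lt;p>To illustrate why the ordering matters, here is a minimal base R sketch, independent of iotables, that turns a shuffled long-form table into a square matrix with one fixed sector ordering. The three sector names and all values are invented for illustration; only the column names &lt;code>prod_na&lt;/code>, &lt;code>induse&lt;/code> and &lt;code>values&lt;/code> follow the Eurostat output above.&lt;/p>
&lt;pre>&lt;code># Hypothetical sector ordering that rows and columns must both follow
sectors &amp;lt;- c(&amp;quot;agriculture&amp;quot;, &amp;quot;industry&amp;quot;, &amp;quot;services&amp;quot;)

# Invented long-form data: one row per matrix cell
long &amp;lt;- data.frame(
  prod_na = rep(sectors, times = 3),  # row label of the cell
  induse  = rep(sectors, each = 3),   # column label of the cell
  values  = c(10, 20, 5, 15, 40, 25, 5, 30, 50)
)
long &amp;lt;- long[c(9, 2, 7, 4, 1, 6, 3, 8, 5), ]  # shuffle the rows on purpose

# Fix the ordering once as factor levels, then reuse it for both dimensions
long$prod_na &amp;lt;- factor(long$prod_na, levels = sectors)
long$induse  &amp;lt;- factor(long$induse, levels = sectors)

# Reshape to a square matrix; each cell occurs once, so sum() just picks it
flows &amp;lt;- with(long, tapply(values, list(prod_na, induse), sum))

# Rows and columns must carry the same labels in the same order
stopifnot(identical(rownames(flows), colnames(flows)))
flows
&lt;/code>&lt;/pre>
&lt;p>Because the ordering lives in the factor levels rather than in the row order of the data frame, the result does not depend on how the long table happens to be sorted.&lt;/p>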
&lt;p>The iotables package contains the abbreviations and human-readable
labels of three statistical vocabularies: the so-called
&lt;code>COICOP&lt;/code> product codes, the &lt;code>NACE&lt;/code> industry codes, and the vocabulary of
the &lt;code>ESA2010&lt;/code> definition of national accounts (which is the government
equivalent of corporate accounting).&lt;/p>
&lt;p>Our package currently solves all equations for direct, indirect effects,
multipliers and inter-industry linkages. Backward linkages show what
happens with the suppliers of an industry, such as catering or
advertising in the case of music festivals, if the festivals reopen. The
forward linkages show how much extra demand this creates for connecting
services that treat festivals as a ‘supplier’, such as cultural tourism.&lt;/p>
&lt;h2 id="lets-seen-an-example">Let’s see an example&lt;/h2>
&lt;pre>&lt;code>## Downloading employment data from the Eurostat database.
## Table lfsq_egan22d cached at C:\Users\...\Temp\RtmpGQF4gr/eurostat/lfsq_egan22d_date_code_FF.rds
&lt;/code>&lt;/pre>
&lt;p>We download employment data and match it with the latest structural information from the
&lt;a href="http://appsso.eurostat.ec.europa.eu/nui/show.do?wai=true&amp;amp;dataset=naio_10_cp1700" target="_blank" rel="noopener">Symmetric input-output table at basic prices (product by
product)&lt;/a>
Eurostat product. A quick look at the Eurostat website already shows
that there is a lot of work ahead to make the data look like an actual
Symmetric input-output table. Download it with &lt;code>iotable_get()&lt;/code>, which
does basic labelling and preprocessing on the raw Eurostat files.
Because of the size of the unfiltered dataset on Eurostat, the following
code may take several minutes to run.&lt;/p>
&lt;pre>&lt;code>library(iotables)

# Slovak SIOT for 2015, in millions of euros, with iotables labels
sk_io &amp;lt;- iotable_get(
  labelled_io_data = NULL,
  source = &amp;quot;naio_10_cp1700&amp;quot;, geo = &amp;quot;SK&amp;quot;,
  year = 2015, unit = &amp;quot;MIO_EUR&amp;quot;,
  stk_flow = &amp;quot;TOTAL&amp;quot;,
  labelling = &amp;quot;iotables&amp;quot;
)
## Reading cache file C:\Users\..\Temp\RtmpGQF4gr/eurostat/naio_10_cp1700_date_code_FF.rds
## Table naio_10_cp1700 read from cache file: C:\Users\..\Temp\RtmpGQF4gr/eurostat/naio_10_cp1700_date_code_FF.rds
## Saving 808 input-output tables into the temporary directory
## C:\Users\...\Temp\RtmpGQF4gr
## Saved the raw data of this table type in temporary directory C:\Users\...\Temp\RtmpGQF4gr/naio_10_cp1700.rds.
&lt;/code>&lt;/pre>
&lt;p>The &lt;code>input_coefficient_matrix_create()&lt;/code> function creates the input
coefficient matrix, which is used by most of the analytical functions.&lt;/p>
&lt;p>&lt;em>a&lt;/em>&lt;sub>&lt;em>ij&lt;/em>&lt;/sub> = &lt;em>X&lt;/em>&lt;sub>&lt;em>ij&lt;/em>&lt;/sub> / &lt;em>x&lt;/em>&lt;sub>&lt;em>j&lt;/em>&lt;/sub>&lt;/p>
&lt;p>It checks the correct ordering of columns, and furthermore it replaces 0
values with 0.000001 to avoid division by zero.&lt;/p>
&lt;pre>&lt;code>input_coeff_matrix_sk &amp;lt;- input_coefficient_matrix_create(
data_table = sk_io
)
## Columns and rows of real_estate_imputed_a, extraterriorial_organizations are all zeros and will be removed.
&lt;/code>&lt;/pre>
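&lt;p>As a minimal sketch of the formula above, the column-wise division can be written in base R as follows. The 3x3 flow matrix and the output vector are invented, and this is not the package’s internal implementation, which also handles the ordering checks and zero values.&lt;/p>
&lt;pre>&lt;code>sectors &amp;lt;- c(&amp;quot;agriculture&amp;quot;, &amp;quot;industry&amp;quot;, &amp;quot;services&amp;quot;)

# Hypothetical inter-industry flows X: rows supply, columns use
X &amp;lt;- matrix(c(10, 20,  5,
              15, 40, 25,
               5, 30, 50),
            nrow = 3, byrow = TRUE,
            dimnames = list(sectors, sectors))

# Hypothetical total output x_j of each sector
x &amp;lt;- c(agriculture = 100, industry = 200, services = 150)

# a_ij = X_ij / x_j: divide every column j of X by x_j
A &amp;lt;- sweep(X, MARGIN = 2, STATS = x, FUN = &amp;quot;/&amp;quot;)
round(A, 3)
&lt;/code>&lt;/pre>
&lt;p>Each column of &lt;code>A&lt;/code> then shows the inputs a sector needs from every other sector to produce one unit of its own output.&lt;/p>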
&lt;p>Then you can create the Leontief inverse, which contains all the
structural information about the relationships among the 64 sectors of the
chosen country, in this case Slovakia, ready for the main equations of
input-output economics.&lt;/p>
&lt;pre>&lt;code>I_sk &amp;lt;- leontieff_inverse_create(input_coeff_matrix_sk)
&lt;/code>&lt;/pre>
&lt;p>And take out the primary inputs:&lt;/p>
&lt;pre>&lt;code>primary_inputs_sk &amp;lt;- coefficient_matrix_create(
data_table = sk_io,
total = 'output',
return = 'primary_inputs')
## Columns and rows of real_estate_imputed_a, extraterriorial_organizations are all zeros and will be removed.
&lt;/code>&lt;/pre>
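&lt;p>For intuition, the Leontief inverse used above is conventionally defined as (I - A)&lt;sup>-1&lt;/sup>, where A is the input coefficient matrix. A minimal base R sketch with an invented 3x3 coefficient matrix follows; it does not reproduce the labelling and checks that the package functions perform.&lt;/p>
&lt;pre>&lt;code># Hypothetical input coefficient matrix A (each column sums to less than 1)
A &amp;lt;- matrix(c(0.10, 0.10, 0.05,
              0.15, 0.20, 0.10,
              0.05, 0.15, 0.30),
            nrow = 3, byrow = TRUE)

I &amp;lt;- diag(nrow(A))  # identity matrix
L &amp;lt;- solve(I - A)   # Leontief inverse

# Sanity check: (I - A) %*% L must give back the identity matrix
stopifnot(isTRUE(all.equal((I - A) %*% L, I)))

# Total output needed from each sector to deliver one extra unit of
# final demand for the first sector's product
final_demand &amp;lt;- c(1, 0, 0)
L %*% final_demand
&lt;/code>&lt;/pre>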
&lt;p>Now let’s see what happens if the government tries to stimulate the economy in
three sectors, agriculture, car manufacturing, and R&amp;amp;D, with a billion
euros. Direct effects measure the initial, direct impact of the change
in demand and supply for a product. When production goes up, it will
create demand in all supply industries (backward linkages) and create
opportunities in the industries that use the product themselves (forward
linkages).&lt;/p>
&lt;pre>&lt;code>library(dplyr)  # attaches %&amp;gt;%, select(), filter(), all_of() and .data

# Keep three industries and four effect rows from the full effects table
direct_effects_create(primary_inputs_sk, I_sk) %&amp;gt;%
  select(all_of(c(&amp;quot;iotables_row&amp;quot;, &amp;quot;agriculture&amp;quot;,
                  &amp;quot;motor_vechicles&amp;quot;, &amp;quot;research_development&amp;quot;))) %&amp;gt;%
  filter(.data$iotables_row %in% c(&amp;quot;gva_effect&amp;quot;, &amp;quot;wages_salaries_effect&amp;quot;,
                                   &amp;quot;imports_effect&amp;quot;, &amp;quot;output_effect&amp;quot;))
## iotables_row agriculture motor_vechicles research_development
## 1 imports_effect 1.3684350 2.3028203 0.9764921
## 2 wages_salaries_effect 0.2713804 0.3183523 0.3828014
## 3 gva_effect 0.9669621 0.9790771 0.9669467
## 4 output_effect 2.2876287 3.9840251 2.2579634
&lt;/code>&lt;/pre>
&lt;p>Car manufacturing requires many imported components, so each unit of extra
demand will create substantial import activity. R&amp;amp;D will create the
most local wages (and support the most jobs) because research is
job-intensive. As we can see, the effects on imports, wages, gross value
added (which will end up in the GDP), and output are very
different in these three sectors.&lt;/p>
&lt;p>This is not the total effect, because some of the increased production
will translate into income, which in turn will be used to create further
demand in all parts of the domestic economy. The total effect is
characterized by multipliers.&lt;/p>
&lt;p>Then solve for the multipliers:&lt;/p>
&lt;pre>&lt;code>multipliers_sk &amp;lt;- input_multipliers_create(
primary_inputs_sk %&amp;gt;%
filter (.data$iotables_row == &amp;quot;gva&amp;quot;), I_sk )
&lt;/code>&lt;/pre>
&lt;p>And select a few industries:&lt;/p>
&lt;pre>&lt;code>set.seed(12)
multipliers_sk %&amp;gt;%
tidyr::pivot_longer ( -all_of(&amp;quot;iotables_row&amp;quot;),
names_to = &amp;quot;industry&amp;quot;,
values_to = &amp;quot;GVA_multiplier&amp;quot;) %&amp;gt;%
select (-all_of(&amp;quot;iotables_row&amp;quot;)) %&amp;gt;%
arrange( -.data$GVA_multiplier) %&amp;gt;%
dplyr::sample_n(8)
## # A tibble: 8 x 2
## industry GVA_multiplier
## &amp;lt;chr&amp;gt; &amp;lt;dbl&amp;gt;
## 1 motor_vechicles 7.81
## 2 wood_products 2.27
## 3 mineral_products 2.83
## 4 human_health 1.53
## 5 post_courier 2.23
## 6 sewage 1.82
## 7 basic_metals 4.16
## 8 real_estate_services_b 1.48
&lt;/code>&lt;/pre>
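&lt;p>The link between effects and multipliers can be sketched with the textbook input-output formulas: the effect row is the primary input coefficient row times the Leontief inverse, and the multiplier divides that effect by the sector’s own coefficient. This is a base R illustration with invented numbers under that assumption, and the package functions may differ in details.&lt;/p>
&lt;pre>&lt;code># Hypothetical input coefficient matrix and its Leontief inverse
A &amp;lt;- matrix(c(0.10, 0.10, 0.05,
              0.15, 0.20, 0.10,
              0.05, 0.15, 0.30),
            nrow = 3, byrow = TRUE)
L &amp;lt;- solve(diag(nrow(A)) - A)

# Hypothetical gross value added per unit of output in each sector
gva_coeff &amp;lt;- c(0.5, 0.4, 0.45)

# Effect: GVA generated across the economy per unit of final demand
gva_effect &amp;lt;- as.vector(gva_coeff %*% L)

# Multiplier: that effect relative to the sector's own GVA coefficient
gva_multiplier &amp;lt;- gva_effect / gva_coeff

# With non-negative coefficients a multiplier cannot fall below 1
stopifnot(all(gva_multiplier &amp;gt;= 1))
round(rbind(gva_coeff, gva_effect, gva_multiplier), 3)
&lt;/code>&lt;/pre>
&lt;p>A sector with a small own coefficient but strong supplier links gets a large multiplier, which is one way to read the high value for car manufacturing in the table above.&lt;/p>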
&lt;h2 id="vignettes">Vignettes&lt;/h2>
&lt;p>The &lt;a href="https://iotables.dataobservatory.eu/articles/germany_1990.html" target="_blank" rel="noopener">Germany
1990&lt;/a>
vignette provides an introduction to input-output economics and re-creates the
examples of the &lt;a href="https://iotables.dataobservatory.eu/articles/germany_1990.html" target="_blank" rel="noopener">Eurostat Manual of Supply, Use and Input-Output
Tables&lt;/a>,
by Jörg Beutel (Eurostat Manual).&lt;/p>
&lt;p>The &lt;a href="https://iotables.dataobservatory.eu/articles/united_kingdom_2010.html" target="_blank" rel="noopener">United Kingdom Input-Output Analytical Tables&lt;/a>
vignette by Daniel Antal, based on the work edited by Richard Wild,
is a use case on how to correctly import data from outside Eurostat
(i.e. not with &lt;code>eurostat::get_eurostat()&lt;/code>) and join it properly to a
SIOT. We also used this example to create unit tests of our functions
from a published, official government statistical release.&lt;/p>
&lt;p>Finally, &lt;a href="https://iotables.dataobservatory.eu/articles/working_with_eurostat.html" target="_blank" rel="noopener">Working With Eurostat
Data&lt;/a>
is a detailed use case of working with all the current functionalities
of the package by comparing two economies, Czechia and Slovakia, and
guides you through many more examples than this short blog post.&lt;/p>
&lt;p>Our package was originally developed to calculate GVA and employment
effects for the Slovak music industry (see our &lt;a href="https://music.dataobservatory.eu/publication/slovak_music_industry_2019/" target="_blank" rel="noopener">Slovak Music Industry Report&lt;/a>), and similar calculations for the
Hungarian film tax shelter. We can now programmatically create
reproducible multipliers for all European economies in the &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital
Music Observatory&lt;/a>, and create
further indicators for economic policy making in the &lt;a href="https://economy.dataobservatory.eu/" target="_blank" rel="noopener">Economy Data
Observatory&lt;/a>.&lt;/p>
&lt;h2 id="environmental-impact-analysis">Environmental Impact Analysis&lt;/h2>
&lt;p>Our package allows the calculation of various economic policy scenarios,
such as changing the VAT on meat or the effects of re-opening music
festivals on aggregate demand, GDP, tax revenues, or employment. But
what about the CO&lt;sub>2&lt;/sub>, methane and other greenhouse gas
effects of reopening festivals or of increasing meat prices?&lt;/p>
&lt;p>Technically our package can already calculate such effects, but to do
so, you have to carefully match further statistical vocabulary items
used by the European Environment Agency about air pollutants and
greenhouse gases.&lt;/p>
&lt;p>The last released version of &lt;em>iotables&lt;/em> is Importing and Manipulating
Symmetric Input-Output Tables (Version 0.4.4), available on Zenodo at
&lt;a href="https://zenodo.org/record/4897472" target="_blank" rel="noopener">https://doi.org/10.5281/zenodo.4897472&lt;/a>,
but we are already working on a new major release. In that release, we
are planning to build the necessary vocabulary into the metadata
functions to increase the functionality of the package, and create new
indicators for our &lt;a href="https://greendeal.dataobservatory.eu/" target="_blank" rel="noopener">Green Deal Data
Observatory&lt;/a>. This experimental
data observatory is creating new, high-quality statistical indicators
from open governmental and open science data sources that have not seen
the daylight yet.&lt;/p>
&lt;h2 id="ropengov-and-the-eu-datathon-challenges">rOpenGov and the EU Datathon Challenges&lt;/h2>
&lt;figure id="figure-ropengov-reprex-and-other-open-collaboration-partners-teamed-up-to-build-on-our-expertise-of-open-source-statistical-software-development-further-we-want-to-create-a-technologically-and-financially-feasible-data-as-service-to-put-our-reproducible-research-products-into-wider-user-for-the-business-analyst-scientific-researcher-and-evidence-based-policy-design-communities">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/media/img/partners/rOpenGov-intro.png" alt="rOpenGov, Reprex, and other open collaboration partners teamed up to build further on our expertise in open source statistical software development: we want to create a technologically and financially feasible data-as-service to put our reproducible research products into wider use for the business analyst, scientific researcher and evidence-based policy design communities." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
rOpenGov, Reprex, and other open collaboration partners teamed up to build further on our expertise in open source statistical software development: we want to create a technologically and financially feasible data-as-service to put our reproducible research products into wider use for the business analyst, scientific researcher and evidence-based policy design communities.
&lt;/figcaption>&lt;/figure>
&lt;p>&lt;a href="http://ropengov.org/" target="_blank" rel="noopener">rOpenGov&lt;/a> is a community of open governmental
data and statistics developers with many packages that make programmatic
access and work with open data possible in the R language.
&lt;a href="https://reprex.nl/" target="_blank" rel="noopener">Reprex&lt;/a> is a Dutch startup that teamed up with
rOpenGov and other open collaboration partners to create a
technologically and financially feasible service to exploit reproducible
research products for the wider business, scientific and evidence-based
policy design community. Open data is a legal concept: it means that
you have the right to reuse the data, but often the reuse requires
significant programming and statistical know-how. We entered the
annual &lt;a href="https://reprex.nl/project/eu-datathon_2021/" target="_blank" rel="noopener">EU Datathon&lt;/a>
competition in all three challenges with applications that provide not
only open-source software, but also daily updated, validated, documented,
high-quality statistical indicators as open data in an open database.
Our &lt;a href="https://iotables.dataobservatory.eu/" target="_blank" rel="noopener">iotables&lt;/a> package is one of
our many open-source building blocks to make open data more accessible
to all.&lt;/p>
&lt;p>&lt;em>Join our open collaboration Digital Music Observatory team as a &lt;a href="https://music.dataobservatory.eu/authors/curator" target="_blank" rel="noopener">data curator&lt;/a>, &lt;a href="https://music.dataobservatory.eu/authors/developer" target="_blank" rel="noopener">developer&lt;/a> or &lt;a href="https://music.dataobservatory.eu/authors/team" target="_blank" rel="noopener">business developer&lt;/a>. More interested in environmental impact analysis? Try our &lt;a href="https://greendeal.dataobservatory.eu/#contributors" target="_blank" rel="noopener">Green Deal Data Observatory&lt;/a> team! Or economic policies, particularly computational antitrust, innovation and small enterprises? Check out our &lt;a href="https://economy.dataobservatory.eu/#contributors" target="_blank" rel="noopener">Economy Data Observatory&lt;/a> team!&lt;/em>&lt;/p></description></item><item><title>Reprex is Contesting all Three Challenges of the EU Datathon 2021 Prize</title><link>https://reprex-next.netlify.app/post/2021-05-21-eu-datathon-2021/</link><pubDate>Fri, 21 May 2021 20:00:00 +0000</pubDate><guid>https://reprex-next.netlify.app/post/2021-05-21-eu-datathon-2021/</guid><description>&lt;p>Reprex, a Dutch start-up enterprise formed to utilize open source software and open data, is looking for partners in an agile, open collaboration to win at least one of the three EU Datathon Prizes. We are looking for policy partners, academic partners and a consultancy partner. Our project is based on agile, open collaboration with three types of contributors.&lt;/p>
&lt;p>With our competing prototypes we want to show that we have a research automation technology that can find open data, process and validate it into high-quality business, policy or scientific indicators, and release them with daily updates through a modern API.&lt;/p>
&lt;p>We are looking for institutions to challenge us with their data problems, and sponsors to increase our capacity. Over the next 5 months, we need to find a sustainable business model for a high-quality and open alternative to other public data programs.&lt;/p>
&lt;h2 id="the-eu-datathon-2021-challenge">The EU Datathon 2021 Challenge&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>&lt;em>To take part, you should propose the development of an application that links and uses open datasets.&lt;/em> - our &lt;a href="https://music.dataobservatory.eu/#contributors" target="_blank" rel="noopener">data curator team&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;em>Your application &amp;hellip; is also expected to find suitable new approaches and solutions to help Europe achieve important goals set by the European Commission through the use of open data.&lt;/em> - this application is developed by our &lt;a href="https://greendeal.dataobservatory.eu/#contributors" target="_blank" rel="noopener">technology contributors&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;em>Your application should showcase opportunities for concrete business models or social enterprises.&lt;/em> - our &lt;a href="https://economy.dataobservatory.eu/#contributors" target="_blank" rel="noopener">service development team&lt;/a> is working to make this happen!&lt;/p>
&lt;/li>
&lt;li>
&lt;p>We use open source software and open data. The applications are hosted on the cloud resources of &lt;a href="#reprex">Reprex&lt;/a>, an early-stage technology startup currently building a viable, open-source, open-data business model to create reproducible research products.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>We are working together with experts in the domain as curators (check out our guidelines if you want to join: &lt;a href="https://curators.dataobservatory.eu/data-curators.html" target="_blank" rel="noopener">Data Curators: Get Inspired!&lt;/a>).&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Our development team works on an open collaboration basis. Our indicator R packages, and our services are developed together with &lt;a href="https://music.dataobservatory.eu/author/ropengov/" target="_blank" rel="noopener">rOpenGov&lt;/a>.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="mission-statement">Mission statement&lt;/h2>
&lt;p>We want to win an &lt;a href="https://op.europa.eu/en/web/eudatathon" target="_blank" rel="noopener">EU Datathon prize&lt;/a> by processing the vast, already-available governmental and scientific open data and making it usable for policy-makers, scientific researchers, and business research end-users.&lt;/p>
&lt;p>“&lt;em>To take part, you should propose the development of an application that links and uses open datasets. Your application should showcase opportunities for concrete business models or social enterprises. It is also expected to find suitable new approaches and solutions to help Europe achieve important goals set by the European Commission through the use of open data.&lt;/em>”&lt;/p>
&lt;p>We aim to win at least one first prize in the EU Datathon 2021. We are contesting &lt;strong>all three&lt;/strong> challenges, which are related to the EU’s official strategic policies for the coming decade.&lt;/p>
&lt;h2 id="challenge-1-a-european-grean-deel">Challenge 1: A European Green Deal&lt;/h2>
&lt;figure id="figure-our-green-deal-data-observatory-connects-socio-economic-and-environmental-data-to-help-understanding-and-combating-climate-change">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/media/img/observatory_screenshots/GD_Observatory_opening_page.png" alt="Our Green Deal Data Observatory connects socio-economic and environmental data to help understanding and combating climate change." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Our Green Deal Data Observatory connects socio-economic and environmental data to help understanding and combating climate change.
&lt;/figcaption>&lt;/figure>
&lt;p>Challenge 1: &lt;a href="https://ec.europa.eu/info/strategy/priorities-2019-2024/european-green-deal_en" target="_blank" rel="noopener">A European Green Deal&lt;/a>, with a particular focus on the &lt;a href="https://ec.europa.eu/commission/presscorner/detail/en/ip_20_2323" target="_blank" rel="noopener">European Climate Pact&lt;/a>, the &lt;a href="https://ec.europa.eu/info/food-farming-fisheries/farming/organic-farming/organic-action-plan_en" target="_blank" rel="noopener">Organic Action Plan&lt;/a>, and the &lt;a href="https://ec.europa.eu/commission/presscorner/detail/en/IP_21_111" target="_blank" rel="noopener">New European Bauhaus&lt;/a>, i.e., mitigation strategies.&lt;/p>
&lt;p>Climate change and environmental degradation are an existential threat to Europe and the world. To overcome these challenges, the European Union created the European Green Deal strategic plan, which aims to make the EU’s economy sustainable by turning climate and environmental challenges into opportunities and making the transition just and inclusive for all.&lt;/p>
&lt;p>Our &lt;a href="http://greendeal.dataobservatory.eu/" target="_blank" rel="noopener">Green Deal Data Observatory&lt;/a> is a modern reimagination of existing ‘data observatories’; currently, there are over 70 permanent international data collection and dissemination points. One of our objectives is to understand why the dozens of the EU’s observatories do not use open data and reproducible research. We want to show that open governmental data, open science, and reproducible research can lead to a higher quality and faster data ecosystem that fosters growth for policy, business, and academic data users.&lt;/p>
&lt;p>We provide high-quality, tidy data through a modern API which enables data flows between public and proprietary databases. We believe that introducing Open Policy Analysis standards with open data, open-source software, and research automation can help the Green Deal policymaking process. Our collaboration is open for individuals, citizen scientists, research institutes, NGOs, and companies.&lt;/p>
&lt;h2 id="challenge-2-a-europe-fit-for-the-digital-age">Challenge 2: An economy that works for people&lt;/h2>
&lt;figure id="figure-our-economy-data-observatory-will-focus-on-competition-small-and-medium-sized-enterprizes-and-robotization">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/media/img/observatory_screenshots/edo_opening_page.jpg" alt="Our Economy Data Observatory will focus on competition, small and medium-sized enterprises and robotization." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Our Economy Data Observatory will focus on competition, small and medium-sized enterprises and robotization.
&lt;/figcaption>&lt;/figure>
&lt;p>Challenge 2: &lt;a href="https://ec.europa.eu/info/strategy/priorities-2019-2024/economy-works-people_en#:~:text=Individuals%20and%20businesses%20in%20the,needs%20of%20the%20EU%27s%20citizens." target="_blank" rel="noopener">An economy that works for people&lt;/a>, with a particular focus on the &lt;a href="https://ec.europa.eu/info/strategy/priorities-2019-2024/economy-works-people/internal-market_en" target="_blank" rel="noopener">Single market strategy&lt;/a>, and particular attention to the strategy’s goals of 1. Modernising our standards system, 2. Consolidating Europe’s intellectual property framework, and 3. Enabling the balanced development of the collaborative economy.&lt;/p>
&lt;p>Big data and automation create new inequalities and injustices and have the potential to create a jobless growth economy. Our &lt;a href="https://economy.dataobservatory.eu/" target="_blank" rel="noopener">Economy Data Observatory&lt;/a> is a fully automated, open source, open data observatory that produces new indicators from open data sources and experimental big data sources, with authoritative copies and a modern API.&lt;/p>
&lt;p>Our observatory monitors the European economy to protect consumers and small companies from unfair competition arising from data and knowledge monopolization and from robotization. We take a critical view of automation, robotization, and the AI revolution in the service-oriented European social market economy from the perspective of small and medium-sized enterprises (SMEs), intellectual property, and competition policy.&lt;/p>
&lt;p>We would like to create early-warning, risk, economic effect, and impact indicators that can be used in scientific, business, and policy contexts for professionals who are working on re-setting the European economy after a devastating pandemic in the age of AI. We are particularly interested in designing indicators that can be early warnings for killer acquisitions, algorithmic and offline discrimination against consumers based on nationality or place of residence, and signs of undermining key economic and competition policy goals. Our goal is to help small and medium-sized enterprises and start-ups to grow, and to furnish data that encourages the financial sector to provide loans and equity funds for their growth.&lt;/p>
&lt;h2 id="challenge-3-a-europe-fit-for-the-digital-age">Challenge 3: A Europe fit for the digital age&lt;/h2>
&lt;figure id="figure-our-digital-music-observatory-is-not-only-a-demo-of-the-european-music-observatory-but-a-testing-ground-for-data-governance-digital-servcies-act-and-trustworthy-ai-problems">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/media/img/observatory_screenshots/dmo_opening_screen.png" alt="Our Digital Music Observatory is not only a demo of the European Music Observatory, but a testing ground for data governance, Digital Services Act, and trustworthy AI problems." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Our Digital Music Observatory is not only a demo of the European Music Observatory, but a testing ground for data governance, Digital Services Act, and trustworthy AI problems.
&lt;/figcaption>&lt;/figure>
&lt;p>Challenge 3: &lt;a href="https://ec.europa.eu/info/strategy/priorities-2019-2024/europe-fit-digital-age_en" target="_blank" rel="noopener">A Europe fit for the digital age&lt;/a>, with a particular focus on &lt;a href="https://ec.europa.eu/info/strategy/priorities-2019-2024/europe-fit-digital-age/excellence-trust-artificial-intelligence_en" target="_blank" rel="noopener">Artificial Intelligence&lt;/a>, the &lt;a href="https://ec.europa.eu/info/strategy/priorities-2019-2024/europe-fit-digital-age/european-data-strategy_en" target="_blank" rel="noopener">European Data Strategy&lt;/a>, the &lt;a href="https://ec.europa.eu/info/strategy/priorities-2019-2024/europe-fit-digital-age/digital-services-act-ensuring-safe-and-accountable-online-environment_en" target="_blank" rel="noopener">Digital Services Act&lt;/a>, &lt;a href="https://digital-strategy.ec.europa.eu/en/policies/digital-skills-and-jobs" target="_blank" rel="noopener">Digital Skills&lt;/a> and &lt;a href="https://digital-strategy.ec.europa.eu/en/policies/connectivity" target="_blank" rel="noopener">Connectivity&lt;/a>.&lt;/p>
&lt;p>The &lt;a href="https://music.dataobservatory.eu/" target="_blank" rel="noopener">Digital Music Observatory&lt;/a> (DMO) is a fully automated, open source, open data observatory that creates public datasets to provide a comprehensive view of the European music industry. It provides high-quality and timely indicators in all four pillars of the planned official European Music Observatory, offering a modern, open source, automated, API-supported alternative that relies largely on open data. The insights and methodologies we are refining in the DMO are transferable to about 60 other EU-funded data observatories that do not currently use governmental or scientific open data.&lt;/p>
&lt;p>Music is one of the most data-driven service industries, where most sales are currently executed by AI-driven autonomous systems that influence market shares and intellectual property remuneration. We provide a template for making these AI-driven systems accountable and trustworthy, with the goal of re-balancing the legitimate interests of creators, distributors, and consumers. Within Europe, this new balance will be an important use case of the European Data Strategy and the Digital Services Act.&lt;/p>
&lt;p>The DMO is a fully functional service that can serve as a testing ground for the European Data Strategy. It can showcase the ways in which the music industry is affected by the problems that the Digital Services Act and the European Trustworthy AI initiatives attempt to regulate. It is being built in open collaboration with national music stakeholders, NGOs, academic institutions, and industry groups.&lt;/p>
&lt;p>Our Product/Market Fit was validated in the world’s second-ranked university-backed incubator program, the &lt;a href="https://music.dataobservatory.eu/post/2020-09-25-yesdelft-validation/" target="_blank" rel="noopener">Yes!Delft AI Validation Lab&lt;/a>. We are currently developing this project with the help of the &lt;a href="https://www.jumpmusic.eu/fellow2021/automated-music-observatory/" target="_blank" rel="noopener">JUMP European Music Market Accelerator&lt;/a> program.&lt;/p>
&lt;h2 id="problem-statement">Problem Statement&lt;/h2>
&lt;p>The EU has an 18-year-old open data regime, and every year it makes public taxpayer-funded data worth tens of billions of euros; the Eurostat program alone handles 20,000 international data products, including at least 5,000 pan-European environmental indicators.&lt;/p>
&lt;p>As open science principles gain increased acceptance, scientific researchers are making hundreds of thousands of valuable datasets public and available for replication every year.&lt;/p>
&lt;p>The EU, the OECD, and UN institutions run around 100 data collection programs, so-called ‘data observatories’, that more or less avoid touching this data and buy proprietary data instead. Annually, each observatory spends between 50 thousand and 3 million EUR on collecting untidy, proprietary data of inconsistent quality, while never even considering open data.&lt;/p>
&lt;figure id="figure-our-automated-data-observatories-are-modern-reimaginations-of-the-existing-observatories-that-do-not-use-open-data-and-research-automation">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://reprex-next.netlify.app/media/img/observatory_screenshots/observatory_collage_16x9_800.png" alt="Our automated data observatories are modern reimaginations of the existing observatories that do not use open data and research automation." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Our automated data observatories are modern reimaginations of the existing observatories that do not use open data and research automation.
&lt;/figcaption>&lt;/figure>
&lt;p>The problem with the current EU data strategy is that, while it produces enormous quantities of valuable open data, the absence of common basic data science and documentation principles means that it often seems cheaper to create new data than to put the existing open data into shape.&lt;/p>
&lt;p>This is an absolute waste of resources and effort. With a few R packages and our deep understanding of advanced data science techniques, we can create valuable datasets from unprocessed open data. In most domains, we are able to repurpose data originally created for other purposes at a historical cost of several billion euros, converting these unused data assets into valuable datasets that can replace tens of millions of euros’ worth of proprietary data.&lt;/p>
&lt;p>What we want to achieve with this project, and we believe such an accomplishment would merit one of the first prizes, is to add value to a significant portion of pre-existing EU open data (for example, data available on &lt;a href="https://data.europa.eu/data/" target="_blank" rel="noopener">data.europa.eu/data&lt;/a>) by re-processing and integrating them into a modern, tidy database with API access, and to find a business model that emphasises a triangular use of data in business, science, and policy-making. Our mission is to modernize the concept of &lt;code>data observatories&lt;/code>.&lt;/p></description></item></channel></rss>