Powered by OpenAIRE graph

DAPHNE

Integrated Data Analysis Pipelines for Large-Scale Data Management, HPC, and Machine Learning
Funder: European Commission
Project code: 957407
Call for proposal: H2020-ICT-2020-1
Funded under: H2020 | RIA
Overall Budget: 6,609,660 EUR
Funder Contribution: 6,609,660 EUR
Description

Modern data-driven applications leverage large, heterogeneous data collections to find interesting patterns and build robust machine learning (ML) models for accurate predictions. Large data sizes and advanced analytics have spurred the development and adoption of data-parallel computation frameworks like Apache Spark or Flink, as well as distributed ML systems like MLlib, TensorFlow, or PyTorch. A key observation is that these new systems share many techniques with traditional high-performance computing (HPC), and that the architectures of the underlying hardware clusters are converging. Yet the programming paradigms, cluster resource management, and data formats and representations differ substantially across data management, HPC, and ML software stacks. There is a trend, though, toward complex data analysis pipelines that combine these different systems. Examples are workflows of distributed data pre-processing, tuned HPC libraries, and dedicated ML systems, but also HPC applications that leverage ML models for more cost-eff
