
IRII (Institut de Robòtica i Informàtica Industrial)
4 Projects
Project (2019 - 2022)
Partners: Imperial College London, UPC, IRII, Université de Bordeaux, Aalto University
Funder: CHIST-ERA
Project Code: CHIST-ERA-17-ORMR-005

Manipulating everyday objects without detailed prior models is still beyond the capabilities of existing robots. This is due to the many challenges posed by diverse types of objects: manipulation requires an understanding and an accurate model of physical properties such as shape, mass, friction, and elasticity. Many objects are deformable, articulated, or even organic with undefined shape (e.g., plants), such that a fixed model is insufficient. On top of this, objects may be difficult to perceive, typically because of cluttered scenarios or complex lighting and reflectance properties such as specularity or partial transparency. Creating such rich representations of objects is beyond the current datasets and benchmarking practices used for grasping and manipulation.

In this project we will develop an automated interactive perception pipeline for building such rich digitizations. More specifically, in IPALM, we will develop methods for the automatic digitization of objects and their physical properties through exploratory manipulations. These methods will be used to build a large collection of object models required for realistic grasping and manipulation experiments in robotics. Household objects such as tools, kitchenware, clothes, and food items are not only widely accessible and the focus of many practical applications, but also pose great challenges for robot object perception and manipulation in realistic scenarios. We propose to advance the state of the art by including household objects that can be deformable, articulated, interactive, specular or transparent, as well as shapeless, such as cloth and food items. Our methods will learn physical properties essential for perception and grasping simultaneously from different modalities: vision, touch, and audio, as well as text documents such as online manuals. The learned properties will include 3D model, texture, elasticity, friction, weight, size, and grasping techniques for intended use.

At the core of our approach is a two-level model, where a category-level model provides priors for capturing instance-level attributes of specific objects. We will exploit available online resources to build prior category-level models, and a perception-action-learning loop will use the robot's vision, audio, and touch to model instance-level object properties. In return, knowledge acquired from a new instance will be used to improve the category-level knowledge. Our approach will allow us to efficiently create a large database of models for objects of diverse types, suitable, for example, for training neural-network-based methods or enhancing existing simulators. We will propose a benchmark and evaluation metrics for object grasping, to enable comparison of results generated with various robotics platforms on our database. The main objectives we pursue are aimed at commercially relevant robotics technologies, as endorsed by support letters from several companies.
We will pursue our goals with a consortium that brings together 5 world-class academic institutions from 5 EU countries (Imperial College London (UK), University of Bordeaux (France), Institut de Robòtica i Informàtica Industrial (Spain), Aalto University (Finland), and the Czech Technical University (Czech Republic)), assembling a complementary research team with strong expertise in the acquisition, processing, and learning of multimodal information with applications in robotics.
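To make the two-level idea concrete, the following is a minimal Python sketch of a category-level prior that seeds an instance-level estimate and is refined by it in return. It is an illustration only, not the IPALM pipeline; all class names, property values, and the Kalman-style update rule are assumptions.

# Illustrative sketch only: a minimal two-level (category/instance) property
# model with a Bayesian-style update. NOT the IPALM pipeline; all names and
# numbers are hypothetical.
from dataclasses import dataclass, field


@dataclass
class CategoryPrior:
    """Category-level prior over one physical property (e.g. mass in kg)."""
    mean: float
    var: float
    count: int = 1

    def refine(self, instance_value: float) -> None:
        # Fold a newly digitized instance back into the category-level
        # knowledge (running mean/variance update).
        self.count += 1
        delta = instance_value - self.mean
        self.mean += delta / self.count
        self.var += (delta * (instance_value - self.mean) - self.var) / self.count


@dataclass
class InstanceModel:
    """Instance-level estimate seeded from the category prior."""
    prior: CategoryPrior
    mean: float = field(init=False)
    var: float = field(init=False)

    def __post_init__(self):
        self.mean, self.var = self.prior.mean, self.prior.var

    def update(self, measurement: float, noise_var: float) -> None:
        # Fuse an exploratory measurement (e.g. derived from vision, touch,
        # or audio) with the current estimate using a scalar Kalman-style gain.
        gain = self.var / (self.var + noise_var)
        self.mean += gain * (measurement - self.mean)
        self.var *= (1.0 - gain)


# Usage: a "mug" category prior informs a specific mug, and the measured
# instance refines the category prior in return.
mug_mass_prior = CategoryPrior(mean=0.30, var=0.01)   # kg
this_mug = InstanceModel(prior=mug_mass_prior)
this_mug.update(measurement=0.42, noise_var=0.002)    # e.g. from a lift action
mug_mass_prior.refine(this_mug.mean)

In this reading, exploratory actions such as lifting or pressing would supply the measurements, and each digitized instance tightens the category-level prior used for the next object of the same type.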
Project (2012 - 2015)
Partners: University of Sheffield, Department of Computer Science; University of Surrey, Department of Electronic Engineering; IRII; UPC; Ecole Centrale de Lyon
Funder: CHIST-ERA
Project Code: ViSen

Today, a typical Web document contains a mix of visual and textual content. Most traditional tools for search and retrieval can successfully handle textual content, but are not prepared to handle heterogeneous documents. This new type of content demands the development of new, efficient tools for search and retrieval. The Visual Sense (ViSen) project aims at automatically mining the semantic content of visual data to enable "machine reading" of images. In recent years, we have witnessed significant advances in the automatic recognition of visual concepts (VCR). These advances have allowed the creation of systems that can automatically generate keyword-based image annotations. The goal of this project is to move a step forward and predict semantic image representations that can be used to generate more informative, sentence-based image annotations, thus facilitating search and browsing of large multi-modal collections. More specifically, the project targets three case studies, namely image annotation, re-ranking for image search, and automatic image illustration of articles. It will address the following key open research challenges:
1. To develop methods that can predict a semantic representation of visual content. This representation will go beyond the detection of objects and scenes and will also recognize a wide range of object relations.
2. To extend state-of-the-art natural language techniques to the tasks of mining large collections of multi-modal documents and generating image captions, using both semantic representations of visual content and object/scene type models derived from semantic representations of the multi-modal documents.
3. To develop learning algorithms that can exploit available multi-modal data to discover mappings between visual and textual content. These algorithms should be able to leverage 'weakly' annotated data and be robust to large amounts of noise.
For this purpose, the project will build on expertise from multiple disciplines, including computer vision, machine learning, and natural language processing (NLP), and gathers four research groups from the University of Surrey (Surrey, UK), the Institut de Robòtica i Informàtica Industrial (IRI, Spain), the Ecole Centrale de Lyon (ECL, France), and the University of Sheffield (Sheffield, UK), each having well-established and complementary expertise in their respective areas of research.
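As an illustration of the third challenge (discovering mappings between visual and textual content from weakly annotated data), the following is a minimal sketch of a joint visual-textual embedding trained with a bidirectional triplet ranking loss in PyTorch. It is not the ViSen model; the feature dimensions, network, and loss formulation are assumptions for illustration only.

# Illustrative sketch only: a joint visual-textual embedding trained with a
# bidirectional triplet ranking loss, a common way to learn image-text
# mappings from weakly (caption-level) annotated pairs. NOT the ViSen model.
import torch
import torch.nn as nn
import torch.nn.functional as F


class JointEmbedding(nn.Module):
    def __init__(self, img_dim=2048, txt_dim=768, emb_dim=256):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, emb_dim)   # projects image features
        self.txt_proj = nn.Linear(txt_dim, emb_dim)   # projects sentence features

    def forward(self, img_feats, txt_feats):
        # L2-normalise so cosine similarity is a plain dot product.
        v = F.normalize(self.img_proj(img_feats), dim=-1)
        t = F.normalize(self.txt_proj(txt_feats), dim=-1)
        return v, t


def ranking_loss(v, t, margin=0.2):
    """Bidirectional hinge loss: matching (image, caption) pairs should score
    higher than any mismatched pair in the batch by at least `margin`."""
    sims = v @ t.T                                    # (B, B) similarity matrix
    pos = sims.diag().unsqueeze(1)                    # matched-pair scores
    cost_im = F.relu(margin + sims - pos)             # caption-retrieval errors
    cost_txt = F.relu(margin + sims - pos.T)          # image-retrieval errors
    mask = torch.eye(sims.size(0), dtype=torch.bool)  # ignore the diagonal
    return cost_im.masked_fill(mask, 0).mean() + cost_txt.masked_fill(mask, 0).mean()


# Usage with random stand-in features.
model = JointEmbedding()
imgs, caps = torch.randn(32, 2048), torch.randn(32, 768)
v, t = model(imgs, caps)
loss = ranking_loss(v, t)
loss.backward()

In practice the random stand-in features would be replaced by image descriptors and sentence encodings of the captions, and tasks such as re-ranking for image search or automatic article illustration would rank candidates by similarity in the shared space.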
Project (from 2019)
Partners: Laboratoire Bordelais de Recherche en Informatique, Imperial College London, UPC, ČVUT, IRII, Aalto University, ENPC
Funder: French National Research Agency (ANR)
Project Code: ANR-18-CHR3-0005
Funder Contribution: 239,630 EUR

The project description is the same as for the IPALM project (CHIST-ERA-17-ORMR-005) above.
Project (from 2013)
Partners: UPC; Institut National des Sciences Appliquées de Lyon - Laboratoire d'Ingénierie des Matériaux Polymères; University of Sheffield, Department of Computer Science; University of Surrey, Department of Electronic Engineering; IRII (Institut de Robòtica i Informàtica Industrial)
Funder: French National Research Agency (ANR)
Project Code: ANR-12-CHRI-0002
Funder Contribution: 296,475 EUR

The project description is the same as for the ViSen project above.