Véronique Benzaken's Home page
  1. Véronique Benzaken
  2. Full Professor
  3. email: veronique dot benzaken at u-psud dot fr
  4. tel: +33 (0)1 6915 6628
  5. office 72
  1. LRI, UMR 8623 - CNRS- VALS research group
  2. PCRI Bat 650 Ada Lovelace
  3. Université Paris-Sud 11
  4. 91405 Orsay Cedex - France
  5. to reach PCRI [look here]

I am currently a member of the VALS (Verification of Algorithms, Languages and Systems) research group, a joint team between LRI, a laboratory of the French National Center for Scientific Research (CNRS) in the Computer Science Department (here) at Université Paris-Sud 11, and the Toccata group (more) at INRIA Saclay. Until August 2010, I was a member of the former database group (RIP) headed by Nicolas Spyratos.

You can consult a very succinct bio here. Here is my list of publications. My Google Scholar home page is [here].


I am interested in Data-Centric Programming Languages and Systems. The explosion of the Internet and the ever-growing importance of data in applications, together with the recent emergence of cloud computing, have given birth to a whirlwind of new data models (XML, JSON, RDF) and languages (XPath, XQuery, Pig, Jaql, SPARQL...). Whether they are developed under the banner of NoSQL (which stands for Not Only SQL), for Big Data analytics, for cloud computing, or as domain-specific languages (DSLs) embedded in a host language, most of them share a common subset of SQL and/or the ability to handle semistructured data.

Such languages can greatly benefit from formal uniform foundations, and we argue that such foundations should account for novel features critical to various application domains. Also, most of those languages provide limited type checking, or ignore it altogether. We believe type checking is essential for many applications, with usage ranging from error detection to optimization.

In this context, one of my favorite research projects is the design and development of ℂDuce, an XML-centric general-purpose functional programming language distributed under an MIT license. ℂDuce is a language for type-safe and fast querying and transformation of XML documents. Related grant: ANR project Blanc SIMI2 Typex (Typeful certified XML: integrating language, logic, and data-oriented best practices).

More precisely, we are currently designing navigational ℂDuce (or ℂDuce+XPath). ℂDuce currently provides a limited form of navigational patterns, which does not comply with the XPath standard. We are implementing a new version that integrates with ℂDuce patterns, is XPath-compliant, and is precisely typed even for backward axes. Details are in this article.

In the same line of research, I am also working on NoSQL languages, which are very popular in the context of big data and/or cloud computing. The aim is to define a general framework that can both express and type such languages via an encoding into a core calculus. In this way, each such language preserves its execution model but obtains for free a formal semantics, a type inference system and, as it happens, a prototype implementation. Here is our work in progress, published at POPL 2013: Static and dynamic semantics for NoSQL Languages.

Programmers who build web-based or cloud-based applications and perform Big Data analytics tie together data coming from very heterogeneous sources such as sensor networks, activity logs, spreadsheets, databases, and social networks. The data is stored in heterogeneous formats, in a weakly structured form, and may be streamed in real time. Such data clashes with the traditional database framework since it is neither uniform enough to be stored nor structured enough to be queried. As a consequence, recent times have seen an explosion of specialized languages or APIs to query specific formats, which are then hosted in general-purpose languages, many of them dynamic languages such as Python, Ruby, or JavaScript, or even in special-purpose ones such as R. One consequence of this new modus operandi is that queries rely on user-defined functions (UDFs) written in the syntax of the host language. Since data engines are not able to interpret these UDFs, evaluating a query requires round-trips between the data engine and the interpreter of the host language, with dramatic consequences for performance: a simple query in the host language can trigger a query avalanche of many repeated queries. The initial query is thus split into several distinct queries that the query engine can barely optimize.
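The round-trip problem described above can be illustrated with a minimal sketch. It assumes an in-memory SQLite database; the schema (`users`, `orders`) and the host-language predicate `is_large` are hypothetical and only serve the illustration. Because the engine cannot see inside `is_large`, the first version issues one query per user, whereas the second expresses the same filter inside a single query that the engine is free to optimize.

```python
import sqlite3


def make_db():
    """Build a small in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users(id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders(id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
        INSERT INTO users VALUES (1, 'ada'), (2, 'bob'), (3, 'eve');
        INSERT INTO orders VALUES
            (1, 1, 120.0), (2, 1, 30.0), (3, 2, 15.0), (4, 3, 200.0);
        """
    )
    return conn


def is_large(amount):
    """Hypothetical UDF written in the host language, opaque to the engine."""
    return amount > 100.0


def avalanche(conn):
    """Filter with the host-language UDF: one extra query per user."""
    queries = 1
    users = conn.execute("SELECT id, name FROM users ORDER BY name").fetchall()
    result = []
    for user_id, name in users:
        rows = conn.execute(
            "SELECT amount FROM orders WHERE user_id = ?", (user_id,)
        ).fetchall()
        queries += 1
        if any(is_large(amount) for (amount,) in rows):
            result.append(name)
    return result, queries


def single_query(conn):
    """Same filter expressed in SQL: a single query reaches the engine."""
    rows = conn.execute(
        """
        SELECT DISTINCT u.name
        FROM users AS u JOIN orders AS o ON o.user_id = u.id
        WHERE o.amount > 100.0
        ORDER BY u.name
        """
    ).fetchall()
    return [name for (name,) in rows], 1


if __name__ == "__main__":
    conn = make_db()
    print(avalanche(conn))     # same names, one query per user plus one
    print(single_query(conn))  # same names, a single query
```

Both functions return the same names; only the number of queries sent to the engine differs, and it grows with the number of users in the first version.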

This line of research, based on a collaboration with Oracle Labs (US), seeks to address the above issues by making the querying interface of data providers effectively multilingual. The goal is to define an intermediate representation of queries (QIR) that is common to application programming languages and to data providers of different natures (e.g., relational DBMSs, key-value stores, map-reduce data stores, XML/JSON/RDF and other NoSQL databases). Database systems supporting the QIR interface can execute queries submitted in QIR form, without requiring applications to first translate them into the system's main declarative querying interface (e.g., SQL, HiveQL, XQuery).
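To give a feel for the idea of a shared intermediate representation, here is a toy sketch. It is not the actual QIR, whose definition is not given on this page: the node types `Scan`, `Filter` and `Project`, and the two backends `to_sql` and `evaluate`, are all hypothetical names chosen for this illustration. The same query tree is either translated to parameterized SQL for a relational engine or interpreted directly over an in-memory store standing in for a non-relational provider.

```python
import operator
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    child: "Query"
    column: str
    op: str        # must be a key of OPS
    value: object


@dataclass(frozen=True)
class Project:
    child: "Query"
    columns: tuple


Query = Union[Scan, Filter, Project]

# Comparison operators understood by both backends.
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def to_sql(q):
    """Backend 1: translate the tree to (sql_text, parameters).

    Table and column names are interpolated as-is, so they are assumed
    to come from trusted code; only values are passed as parameters.
    """
    if isinstance(q, Scan):
        return f"SELECT * FROM {q.table}", []
    if isinstance(q, Filter):
        if q.op not in OPS:
            raise ValueError(f"unsupported operator: {q.op}")
        sql, params = to_sql(q.child)
        return (
            f"SELECT * FROM ({sql}) AS t WHERE {q.column} {q.op} ?",
            params + [q.value],
        )
    if isinstance(q, Project):
        sql, params = to_sql(q.child)
        return f"SELECT {', '.join(q.columns)} FROM ({sql}) AS t", params
    raise TypeError(f"not a query node: {q!r}")


def evaluate(q, store):
    """Backend 2: interpret the tree over a dict mapping table names to rows."""
    if isinstance(q, Scan):
        return list(store[q.table])
    if isinstance(q, Filter):
        test = OPS[q.op]
        return [r for r in evaluate(q.child, store) if test(r[q.column], q.value)]
    if isinstance(q, Project):
        return [{c: r[c] for c in q.columns} for r in evaluate(q.child, store)]
    raise TypeError(f"not a query node: {q!r}")


if __name__ == "__main__":
    query = Project(Filter(Scan("orders"), "amount", ">", 100.0), ("user_id",))
    print(to_sql(query))
    store = {"orders": [{"user_id": 1, "amount": 120.0},
                        {"user_id": 2, "amount": 15.0}]}
    print(evaluate(query, store))
```

The application builds the query tree once; each data provider decides how to run it, which is the property the paragraph above attributes to a common representation.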

Together with Évelyne Contejean and Stéfania Dumbrava, we are currently working on the formalisation in Coq of data-intensive management systems, in the context of the "Datacert: towards data certification" project supported by ANR (2016-2021). The aim is to verify and certify data-intensive systems such as RDBMSs and/or XML processing engines with the Coq proof assistant and the Why(3) platform. Preliminary work has been published at ESOP 2014 and is available here.

Internship Proposals

Here are some internship proposals around Coq and Data-Intensive Programming Languages and Systems.

  • I am delighted to be a co-organiser (with J. Cheney, T. Grust and D. Vytiniotis) of the upcoming Dagstuhl Seminar on Programming Languages for Big Data (PlanBig) (proposal 26-0613), which will take place from Monday, December 15 to Friday, December 19, 2014 at Schloss Dagstuhl.

Past Events (From 2002)

International Events

National Events

In Memoriam

On December 18, 2010, Madame Jacqueline de Romilly, Hellenist, passed away.

A member of the Académie française and the first woman professor at the Collège de France, she is known for her work on the civilization and language of ancient Greece, and in particular for her work on Thucydides.

She said of herself that she had not had, "of course", the life she wished for:

"To have been Jewish under the Occupation, to end up alone, almost blind, without children and without family, is that really sensational? But my life as a professor was, from beginning to end, the one I wished for."

O horror, horror, horror! Tongue nor heart Cannot conceive nor name thee!