The Semantic Web is an extension of the existing Web. It aims to facilitate information retrieval and to improve the interoperability of agents (whether human or software) beyond what the Web currently offers, by formalizing the knowledge contained in the Web's unstructured content; that is, information created and understood by humans but not yet usable by machines. The Semantic Web describes the content of resources in languages that are understandable by both humans and machines and that are more expressive than the hypertext format. Access to this knowledge, and better processing of it, should improve information retrieval by easing the sometimes arduous operations of ranking, cross-checking and selecting results.
To achieve this, the Semantic Web relies on annotations and metadata referred to as "semantic" in the sense that they express complex information that sophisticated mechanisms exploit to provide "intelligent" services to the user. These mechanisms are often inspired by methods developed in the field of artificial intelligence, and in particular in knowledge representation and reasoning (abbreviated KRR). More specifically, the Semantic Web aims at:
- formalizing the function of Web services or the content of Web resources through metadata or annotations formulated in a sufficiently expressive description language, whose vocabulary can be defined by an ontology and whose operationalization in a knowledge representation language gives it reasoning capabilities (if it does not already have them);
- further developing homogeneous access to the heterogeneous content of the Web by means of mediation or aggregation of content, possibly using annotation processes to structure that content;
- qualifying the knowledge under consideration in order to assess its relevance;
- ensuring that the processes implemented scale up, i.e. that they are applicable to the Web as a whole and can withstand the information load induced by the abundance of resources.
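As an illustration of the first aim, the following minimal sketch shows annotations expressed as subject-predicate-object triples, a vocabulary defined by a small ontology, and a simple reasoning step over both. It is written in plain Python rather than with any Semantic Web library; the `ex:` class names, the example URL, and the helper functions `infer_types` and `resources_of_type` are hypothetical and chosen only for this example.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical ontology: a tiny class hierarchy defining the vocabulary.
ONTOLOGY: Set[Triple] = {
    ("ex:Novel", "rdfs:subClassOf", "ex:Book"),
    ("ex:Book", "rdfs:subClassOf", "ex:Document"),
}

# Hypothetical annotation: metadata attached to a Web resource.
ANNOTATIONS: Set[Triple] = {
    ("http://example.org/page1", "rdf:type", "ex:Novel"),
}


def infer_types(triples: Set[Triple]) -> Set[Triple]:
    """Propagate rdf:type along rdfs:subClassOf until nothing new is derived."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        subclass = [(s, o) for s, p, o in closed if p == "rdfs:subClassOf"]
        typed = [(s, o) for s, p, o in closed if p == "rdf:type"]
        for resource, cls in typed:
            for child, parent in subclass:
                if cls == child:
                    derived = (resource, "rdf:type", parent)
                    if derived not in closed:
                        closed.add(derived)
                        changed = True
    return closed


def resources_of_type(triples: Set[Triple], cls: str) -> List[str]:
    """Return the resources explicitly typed with the given class."""
    return sorted(s for s, p, o in triples if p == "rdf:type" and o == cls)


# The annotation only says "Novel"; reasoning makes the page findable as a Document.
KNOWLEDGE = infer_types(ONTOLOGY | ANNOTATIONS)
```

Querying `resources_of_type(KNOWLEDGE, "ex:Document")` returns the annotated page even though the annotation never mentions `ex:Document`, which is the kind of benefit that reasoning over an ontology is meant to bring to information retrieval.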
To pursue these aims, research is mainly focused on:
- knowledge engineering, which studies ways of representing human knowledge and automating reasoning on that knowledge by building on the results achieved in KRR while adapting them to a specific context of use;
- linguistic engineering, which is dedicated to methods of analyzing the textual content of Web resources;
- human-machine interaction, which studies the conditions that favor exchange between a system and its user (and vice versa).