D2.1 State of the Art on Engines

Deliverable Type *
Nature of Deliverable **
Version: Released
Created: 23-11-2007
Contributing Workpackages
Editor:
Contributors/Author(s)

* Deliverable type: PU = Public, RE = Restricted to a group of the specified Consortium, PP = Restricted to other program participants (including Commission Services), CO = Confidential, only for members of the CHORUS Consortium (including the Commission Services)
** Nature of Deliverable: P = Prototype, R = Report, S = Specification, T = Tool, O = Other
Version: Preliminary, Draft 1, Draft 2, …, Released

Abstract:

Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective.

The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines.

From a socio-economic perspective, we take stock of the impact and legal consequences of these technical advances and point out future directions of research.

Keyword List: multimedia, search, content-based indexing, benchmarking, mobility, peer-to-peer, use cases, socio-economic aspects, legal aspects

CHORUS Project Consortium

Table of Contents

1. INTRODUCTION
2. USER INTERACTION
2.2.1. User centered design
2.2.2. The HCI perspective
2.4.1. Beyond "relevance" as a target notion - How can we formalise our understanding of what users are up to?
2.4.2. Use cases as a vehicle
2.10.1. Actions
2.10.2. Corpora
2.10.3. Methodologies (Technical Requirements)
2.10.4. Products
2.10.5. Users
2.10.6. User Classes
3. STATE OF THE ART IN AUDIO-VISUAL CONTENT INDEXING AND RETRIEVAL TECHNOLOGIES
3.2.1. Learning and Semantics
3.2.2. New Features & Similarity Measures
3.2.3. 3D Retrieval
3.2.4. Browsing and Summarization
3.2.5. High Performance Indexing
3.3.1. Multimedia Analysis in European Projects
3.3.2. Multimedia Analysis in National Initiatives
3.3.3. State-of-the-Art in European Research
ANNEX B: OVERVIEW OF THE NATIONAL RESEARCH PROJECTS
4. SOA OF EXISTING BENCHMARKING INITIATIVES + WHO IS PARTICIPATING IN WHAT (EU&NI)
TrecVid
ImageClef
ImageEval
TechnoVision-ROBIN
IAPR TC-12 Image Benchmark
CIVR Evaluation Showcase
SHREC (3D)
MIREX
INEX
Cross-Language Speech Retrieval (CL-SR)
NIST Spoken Term Detection
NIST Rich Transcription
ANNEX I: EVALUATION EFFORTS (AND STANDARDS) WITHIN ONGOING EU PROJECTS AND NATIONAL INITIATIVES
ANNEX II: RELATED CHORUS EVENTS TO BENCHMARKING AND EVALUATION
5. P2P SEARCH, MOBILE SEARCH AND HETEROGENEITY
5.2.1. Introduction
5.2.2. Context
5.2.3. Main players
5.2.4. The State of the Art
5.3.1. Introduction
5.3.2. Context
5.3.3. Main players
5.3.4. The State of the Art
6. ECONOMIC AND SOCIAL ASPECTS OF SEARCH ENGINES
6.2.1. An Innovation-based Business
6.2.2. Issues with the Advertising Model
6.2.3. Adjacent Markets
6.3.1. Patterns
6.3.2. The Web 2.0 Context
6.3.3. Privacy, Security and Personal Liberty
6.3.4. Result Manipulation
6.3.5. The public responsibility of search engines
6.4.1. Overview
6.4.2. World-wide Players
6.4.3. Regional Champions
6.4.4. European Actors
7. SEARCH ENGINES FOR AUDIO-VISUAL CONTENT: LEGAL ASPECTS, POLICY IMPLICATIONS & DIRECTIONS FOR FUTURE RESEARCH
7.2.1. Four Basic Information Flows
7.2.2. Search Engine Operations and Trends
7.3.1. The Centrality of Search
7.3.2. The Adapting Search Engine Landscape
7.3.3. Extending Beyond Search
7.4.1. Copyright in the Search Engine Context
7.4.2. Trademark Law
7.4.3. Data Protection Law
7.5.1. Increasing Litigation in AV Search Era: Law as a Key Policy Lever
7.5.2. Combined Effect of Laws: Need to Determine Default Liability Regime
7.5.3. EU v. US: Law Impacts Innovation In AV Search
7.7.1. Social Trends
7.7.2. Economic trends
7.7.3. Further Legal Aspects
ANNEX TO CHAPTER 3: SUMMARY AND GOALS OF USE CASES

1. INTRODUCTION

2. USER INTERACTION

2.1. Summary

2.2. User involvement in the system development process

2.2.1. User centered design

2.2.2. The HCI perspective

2.3. User generated content

[Figure: spectrum of user generated content. Content types: commercial content (people pay to consume) and personal content (people pay to share). Creators: professional content creators, semi-professionals, prosumers, end-users. Number of content consumers: everybody. Popularity of content: one-to-many, one-to-'a few', everybody-to-'a few'. Examples: professional shows and broadcasts (news, sports, shows, …), regional and local community TV, VRT Ketnet Kick (children create content shown on TV), video diaries and blogging, personal photo sharing, home video (birthday videos, …).]

2.4. User centered design of multi-medial information access services

2.4.1. Beyond "relevance" as a target notion - How can we formalise our understanding of what users are up to?


2.4.2. Use cases as a vehicle

2.5. Use Cases in Current Research Projects

2.6. Use-cases and Services

2.7. Overview: The CHORUS Analysis of Use Cases


2.8. Participating Projects

2.9. Dimensions of Data Analysis


2.10. Data Analysis

2.10.1. Actions

Summary

Table 1: Actions

Action                   Total   Chart segments
Retrieval                48      31, 11, 3, 3
Classification Systems   40      30, 6, 4
Analysis                 11      8, 2, 1
Other                     8      5, 3

[Figure: stacked bar chart of actions; horizontal axis from 0 to 60.]

Analysis

Sub-categories named: Retrieval (General), Search, Browse, Content Delivery, Classification Systems, Extraction/Indexing, Multimedia Analysis, Text Analysis.

2.10.2. Corpora

Summary

Table 2: Corpora

Corpus       Chart segments
Multimedia   52, 16, 1
Text         7, 6, 3, 2, 1, 1
Other        8, 6

[Figure: stacked bar chart of corpora; horizontal axis from 0 to 80.]

Analysis

Corpus types named: Multimedia, Text.

2.10.3. Methodologies (Technical Requirements)

Summary

Table 3: Methods (Requirements)

Method (Technical Requirement)   Chart segments
Other                            28, 10
Query by Text                    9, 4, 2
Semantic Classification          4, 2, 8
Extraction                       6, 5, 3
Query by Multimedia              10, 2
Miscellaneous                    8
Statistical Classification       2, 1, 1, 1, 1

[Figure: stacked bar chart of methods; horizontal axis from 0 to 40.]

Analysis

Methods named: Other, Query by Text, Query by Keyword, Query by Multimedia, Semantic Classification, Statistical Classification, Extraction, Controlled Metadata (Profile), GUI Development.

2.10.4. Products

Summary

Products

Product                        Chart segments
Retrieval Systems              34, 3, 2, 1
Classified Content             8, 6, 3, 2, 2, 1
Classification Systems Tools   8, 3, 3, 1, 1, 1, 1, 1
Other                          12, 5
Human-Computer Interaction     5
Indexing Manager/Editor        2, 1, 1

[Figure: stacked bar chart of products; horizontal axis from 0 to 45.]

Analysis

Products named: Retrieval Systems (General), Social Sharing System, Targeted Delivery System, Classified Content, Indexed Content, Multimedia Segments, Classification Systems Tools, Taxonomy Generator, Index Generators, Concordance Generators, Controlled Vocabulary Development, Standards Development, Human-Computer Interaction, Indexing Manager/Editor, Digital Asset Manager.

2.10.5. Users

Summary

Users

User               Share
Archivist          34%
Journalist         19%
Researcher         15%
Tourism/Heritage   10%
Designers           5%
Consumer            5%
Art Director        3%

[Figure: pie chart of users.]

Analysis

2.10.6. User Classes

Summary

User Classes

User class   Chart segments
End User     28, 18, 16
Other User   46
Other        6

[Figure: stacked bar chart of user classes; horizontal axis from 0 to 70.]

Analysis

User classes named: End Users, End User (Vague), Content Providers.

2.11. Conclusion and Future Prospects


3. STATE OF THE ART IN AUDIO-VISUAL CONTENT INDEXING AND RETRIEVAL TECHNOLOGIES

3.1. Introduction

3.2. Recent Work

3.2.1. Learning and Semantics

3.2.1.1. Concept Detection in Complex Backgrounds


3.2.1.2. Relevance Feedback


3.2.2. New Features & Similarity Measures

3.2.3. 3D Retrieval


3.2.4. Browsing and Summarization

3.2.5. High Performance Indexing

3.3. Summary of Multimedia Analysis in European Research

3.3.1. Multimedia Analysis in European Projects

Modalities (columns): Speech/Audio, Image, 3D, Video, Text/Semantics
Projects (rows): DIVAS, PHAROS, RUSHES, SAPIR, SEMEDIA, TRIPOD, VICTORY, VIDI-VIDEO, VITALAS

3.3.2. Multimedia Analysis in National Initiatives

Modalities (columns): Speech/Audio, Image, 3D, Video, Text/Semantics
National initiatives (rows): Quaero (French), Theseus (German), iAD (Norway), MultimediaN (Dutch), IM2 (Swiss), Mundos (Spanish)

3.3.3. State-of-the-Art in European Research

State-of-the-Art: Speech Analysis

State-of-the-Art: Music Analysis

State-of-the-Art: Image Analysis

State-of-the-Art: Video Analysis

State-of-the-Art: Text/Semantic Analysis

3.4. Future Directions


Annex A: Overview of the 9 IST projects

Project name: Divas
Objectives:
Modalities:

Project name: RUSHES
Objectives:
Modalities:

Project name: SAPIR
Objectives:
Modalities:

Project name: SEMEDIA (Search Environments for Media)
Objectives: To develop techniques to extract metadata from ‘essence’
Modalities:

Project name: Tripod
Objectives:
Modalities:

Project name: VICTORY
Objectives:
Modalities:

Project name:
Objectives:
Modalities:

Project name: VITALAS

Modules on Speech/Audio Indexing and Retrieval

module/task: Music Segmentation
module/task: Speech Segmentation
module/task: Audio retrieval
module/task:
module/task: Speech Mining and Segmentation
module/task: Jingle detection
module/task: Audio Analysis

Modules on Image Indexing and Retrieval

module/task:
module/task: Image
module/task: Similarity search in large datasets
module/task: Objects and visual concepts recognition
module/task: Visualization maps

Modules on 3D Indexing and Retrieval

module/task: 3D video scene description
module/task: 3D Search engine

Modules on Video Indexing and Retrieval

module/task: Video segmentation, indexing and search
module/task: Relevance feedback
module/task: AV information retrieval
module/task: Video annotation and summarisation
module/task: Video segmentation
module/task: Quick overview of media including sparsely annotated material
module/task: Low level indexing for efficient searches of A/V databases
module/task: Efficient combination of metadata sources
module/task: Data architectures and security in networked media environments
module/task: Media mining techniques
module/task: Interface design for context-aware adaptive search, browsing and annotation
module/task:
module/task: Feedback-Only Search
module/task: Prototypes in Broadcast Media Environments
module/task: Integrated retrieval and mining models
module/task: Prototypes for media access, search and retrieval in web-based communities
module/task: Visual analysis
module/task: Learning
module/task: Rigid local entities retrieval
module/task: Large set of cross-media concepts extraction

Modules on Text Indexing and Retrieval

module/task: Semantic reasoning
module/task: Text semantic retrieval
module/task: Text
module/task: Caption augmentation for images with existing captions
module/task: Text search module
module/task: Word sense disambiguation

Annex B: Overview of the national research projects

Project Quaero
• Private companies
• Public research labs
• Public institutions

Project Theseus
Industry:
Research and public organisations:

Project iAD – information access disruptions
• Schema agnostic indexing services
• Processing high-speed data streams
• Scalable infrastructure for push and pull based computing
• Extreme precision and recommendation in multimedia access
• Understanding and managing the disruptive potential of iAD

Project Interactive Multimodal Information Management (IM2)

4. SOA OF EXISTING BENCHMARKING INITIATIVES + WHO IS PARTICIPATING IN WHAT (EU&NI)

4.1. Introduction and WG2 objectives:

James Z. Wang, Nozha Boujemaa, Alberto Del Bimbo, Donald Geman, Alexander G. Hauptmann and Jelena Tesic (2006). Panel: Diversity in Multimedia Information Retrieval Research. In Proceedings of the 8th ACM International Workshop on Multimedia Information Retrieval, 5-12.

• Establishing a common dataset (training set, test set)

• Definition of the task to be performed

• Establishing the ground truth

• Assessing results relative to the ground truth
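The four steps listed above can be illustrated with a minimal sketch. It is not taken from any of the benchmarks described in this chapter: the `Benchmark` class, the `assess` function and the toy item identifiers are hypothetical, and precision and recall are used only as a familiar way of scoring a run against the ground truth.

```python
from dataclasses import dataclass


@dataclass
class Benchmark:
    """Hypothetical container for the four ingredients of a benchmark."""
    training_set: list   # items participants may use to tune their systems
    test_set: list       # items the systems are evaluated on
    task: str            # definition of the task to be performed
    ground_truth: dict   # query id -> set of relevant item ids


def assess(benchmark, run):
    """Score a run (query id -> list of retrieved ids) against the ground truth.

    Returns a dict mapping each query id to a (precision, recall) pair.
    """
    scores = {}
    for query, relevant in benchmark.ground_truth.items():
        retrieved = run.get(query, [])
        hits = len(relevant.intersection(retrieved))
        precision = hits / len(retrieved) if retrieved else 0.0
        recall = hits / len(relevant) if relevant else 0.0
        scores[query] = (precision, recall)
    return scores


# Toy example: one query, two relevant images, a run that finds one of them.
toy = Benchmark(
    training_set=["img1", "img2"],
    test_set=["img3", "img5", "img7"],
    task="retrieve images relevant to a text query",
    ground_truth={"q1": {"img3", "img7"}},
)
scores = assess(toy, {"q1": ["img3", "img5"]})
```

In this toy run one of the two retrieved images is relevant and one of the two relevant images is found, so both precision and recall are 0.5 for `q1`.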

4.2. Overview of existing benchmark initiatives

• Definition of tracks and tasks

• Evaluation metrics of tasks

• Type and size of the data used

• Method used for generating the ground-truth

• Participation statistics

• URL

• Conclusion

TrecVid
Definition of tracks and tasks (2007)
• Maximum duration to be determined
• Summaries can contain picture-in-picture, split screens
Evaluation metrics of tasks
Type and size of data used: Data unavailable
Method used for the generation of the ground-truth
Participation statistics
URL
Conclusion from TrecVid perspective

ImageClef
Definition of tracks and tasks (2007)
Evaluation metrics of tasks
Type and size of data used
Method used for the generation of the ground-truth
Participation statistics
URL
Conclusion

ImageEval
Definition of tracks and tasks (2007)
Evaluation metrics of tasks
Type and size of data used
Method used for the generation of the ground-truth
Participation statistics: Data unavailable
URL
Conclusion

TechnoVision-ROBIN
Definition of tracks and tasks (2007)
Evaluation metrics of tasks
Type and size of data used
Method used for the generation of the ground-truth
Participation statistics: Data unavailable
URL
Conclusion

IAPR TC-12 Image Benchmark
Definition of tracks and tasks (2007)
Evaluation metrics of tasks
Type and size of data used
Method used for the generation of the ground-truth
Participation statistics: Data unavailable
URL
Conclusion

CIVR Evaluation Showcase
Definition of tracks and tasks (2007)
Evaluation metrics of tasks
Type and size of data used
Method used for the generation of the ground-truth
Participation statistics: Data unavailable
URL
Conclusion

SHREC (3D)
Definition of tracks and tasks (2007)
Evaluation metrics of tasks
Type and size of data used
Method used for the generation of the ground-truth
Participation statistics: Data unavailable
URL
Conclusion

MIREX
Definition of tracks and tasks (2007)
Evaluation metrics of tasks
Type and size of data used
Method used for the generation of the ground-truth
Participation statistics: Data unavailable
URL
Conclusion

INEX
Definition of tracks and tasks (2007): MMfragments task; MMimages task
Evaluation metrics of tasks: MMfragments task; MMimages task
Type and size of data used: Wikipedia XML collection; Wikipedia image collection; Wikipedia image XML collection; Image classification scores; Image features
Method used for the generation of the ground-truth: MMfragments task; MMimages task
Participation statistics
URL
Synthesis and conclusion

Cross-Language Speech Retrieval (CL-SR)
Definition of tracks and tasks (2006)
Evaluation metrics of tasks
Type and size of data used
Method used for the generation of the ground-truth
Participation statistics
URL

NIST Spoken Term Detection
Definition of tracks and tasks (2006)
Evaluation metrics of tasks
Type and size of data used
Method used for the generation of the ground-truth
Participation statistics
URL

NIST Rich Transcription
Definition of tracks and tasks (2007)
Evaluation metrics of tasks
Type and size of data used
Method used for the generation of the ground-truth
Participation statistics
URL

4.3. Conclusion

Annex I: Evaluation efforts (and standards) within ongoing EU Projects and National Initiatives

Project names: MultimediaN & VidiVideo

Project name: SAPIR
• Perceived effectiveness
• Perceived efficiency
• Perceived satisfaction

Project name: Tripod
• muscle

Project name: AIM@SHAPE
Michela Spagnuolo
• digital 3D objects
• depending on the object represented
• SHREC: 3D Shape Retrieval Contest, international initiative
• http://www.aimatshape.net/event/SHREC/
• The contest is organized in tracks, each for a specific 3D retrieval task, defined either by retrieval method (e.g. partial/global) or by shape type (e.g. protein/CAD models).
• For each query there exists a set of highly relevant items and a set of marginally relevant items. Therefore, most of the evaluation measures have been split up as well according to the two sets. Measures used: true and false positives, true and false negatives, first and second tier, precision, recall, average precision, average dynamic recall, cumulated gain vector, discounted cumulated gain vector, normalized cumulated gain vector (see Section 4 of the attached SHERC06.PDF for a complete description of the performance measures).
• Manually, by track organizers
• In the first two SHREC contests, they have been maintained by the organizers; we are considering the possibility to maintain them directly in the ShapeRepository of the AIM@SHAPE project (see shapes.aimatshape.net).
• AIM@SHAPE is organizing the contest and more precisely, Remco Veltkamp, UU (email: [email protected])
• not applicable, the contest is meant for a scientific audience
• none
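Several of the SHREC measures named above can be sketched in a few lines. This is an illustrative sketch only, not the official SHREC scoring code: the function names are hypothetical, the first and second tier are assumed to be the recall within the top k and top 2k results (k being the number of relevant items), and the discounted cumulated gain is assumed to use a base-2 logarithmic discount with gains of 2 for highly relevant and 1 for marginally relevant items.

```python
import math


def precision_recall(ranked, relevant):
    """Precision and recall of a ranked result list against a relevant set."""
    hits = sum(1 for item in ranked if item in relevant)
    precision = hits / len(ranked) if ranked else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def tier(ranked, relevant, multiple):
    """Recall within the top (multiple * |relevant|) results.

    Assumption: multiple=1 gives the first tier, multiple=2 the second tier.
    """
    k = multiple * len(relevant)
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant) if relevant else 0.0


def average_precision(ranked, relevant):
    """Mean of the precision values at the ranks where relevant items occur."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def discounted_cumulated_gain(ranked, gains):
    """Discounted cumulated gain vector; gains maps item -> graded relevance."""
    vector, running = [], 0.0
    for rank, item in enumerate(ranked, start=1):
        gain = gains.get(item, 0)
        # Assumed discount: none at rank 1, division by log2(rank) afterwards.
        running += gain if rank == 1 else gain / math.log2(rank)
        vector.append(running)
    return vector


# Toy query: "a" and "b" are highly relevant, "y" is marginally relevant.
ranked = ["a", "x", "b", "y"]
highly = {"a", "b"}
p, r = precision_recall(ranked, highly)
first_tier = tier(ranked, highly, 1)
second_tier = tier(ranked, highly, 2)
ap = average_precision(ranked, highly)
dcg = discounted_cumulated_gain(ranked, {"a": 2, "b": 2, "y": 1})
```

Because each query has both a highly relevant and a marginally relevant set, the set-based measures can simply be computed once per set, while the gain-based measures combine both sets through the graded gains.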

Project name: VITALAS
● not defined yet
● not defined yet
● not defined yet

Annex II: Related Chorus Events to Benchmarking and Evaluation

NAVS Chorus cluster, March 14th 2007

Panel discussion:

CBMI Chorus Panel: June 25th 2007 Bordeaux Topic: Benchmarking Multimedia Search Engines


5. P2P SEARCH, MOBILE SEARCH AND HETEROGENEITY

5.1. Introduction

5.2. P2P search

5.2.1. Introduction

Figure 1:

• Peer discovery

• Querying peers for content

• Sharing content with other peers

5.2.2. Context

• Flooding broadcast queries

• Selective forwarding systems

• Decentralized hash table networks

• Centralized indexes and repositories

• Distributed indexes and repositories

• Relevance driven network crawlers
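As an illustration of the first item in the list above, flooding broadcast queries can be sketched as a breadth-first propagation of the query over the overlay network, bounded by a time-to-live (TTL). This is a minimal sketch under stated assumptions, not a description of any particular system: the overlay graph, the per-peer content sets and the function name `flood_query` are hypothetical.

```python
from collections import deque


def flood_query(overlay, content, origin, keyword, ttl):
    """Flood a keyword query from `origin` and return the peers that hold it.

    overlay: dict mapping each peer to the list of its neighbours
    content: dict mapping each peer to the set of keywords it can serve
    ttl:     maximum number of hops the query may travel
    """
    seen = {origin}                 # peers that already received the query
    queue = deque([(origin, ttl)])  # (peer, remaining hops)
    hits = set()
    while queue:
        peer, remaining = queue.popleft()
        if keyword in content.get(peer, set()):
            hits.add(peer)
        if remaining == 0:
            continue                # TTL exhausted: do not forward further
        for neighbour in overlay.get(peer, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, remaining - 1))
    return hits


# Toy overlay: A - B - D - E, plus A - C.
overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"],
           "D": ["B", "E"], "E": ["D"]}
content = {"C": {"jazz"}, "E": {"jazz"}}
near = flood_query(overlay, content, "A", "jazz", ttl=2)
far = flood_query(overlay, content, "A", "jazz", ttl=3)
```

With a TTL of 2 the query started at peer A only reaches C, B and D, so only C answers; raising the TTL to 3 also reaches E. The trade-off between coverage and message load is what the more selective schemes in the list try to improve.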

5.2.3. Main players

5.2.4. The State of the Art

5.2.4.1. The scalability viewpoint

5.2.4.2. and Text only

• Searching in unstructured P2Ps

• Iterative deepening

• k-walker random walk and related schemes

• Directed BFS and intelligent search

• Local indices based search

• Routing indices based search

• Attenuated bloom filter based search

• Adaptive probabilistic search

• Dominating set based search

• Searching in strictly structured P2Ps

• Searching in hierarchical DHT P2Ps

• Kelips

• Coral and related schemes

• Other hierarchical DHT P2Ps


• Searching in non-DHT P2Ps

• Searching in loosely structured P2Ps

5.2.4.3. Audiovisual search

5.3. Mobile search

5.3.1. Introduction

• Personalized search and Context information

• Results page layout

• Local & vertical results

• Character limits

• URL display

• Recommendation

5.3.2. Context

• Presentation of results

• Search box location

• Local & vertical results

• Location setting

• Keyword bolding

• User agent detection

• Transcoding

• the Impact on mobile Search Engine Optimizer?

5.3.3. Main players

• Nouveau Directory Assistance &

• Text-Based

• WAP Local Search

• Local Mobile Applications

• Bringing It All Together


5.3.4. The State of the Art

• Interfaces

• Community-based search

• Results Classification

• Contextual search


6. ECONOMIC AND SOCIAL ASPECTS OF SEARCH ENGINES

6.1. Introduction


[Figure 2 diagram: services grouped under "Communicating" (free VoIP, instant messengers, chat, free emails), "Providing Content" (search engines) and "Producing Content" (sharing pics & music, blogs, social networks).]

Figure 2: Google, Yahoo and operate a number of services that render them key players as enablers of content producers, as information providers and as communication facilitators.

6.2. Economic Aspects

6.2.1. An Innovation-based Business

Table 1: Revenue for Google in the period 2003 in thousand US$. Source: Google Annual Report 2005 and 2006. Information provided to the US Securities and Exchange Commission.

Table 2: Google's advertising and licensing revenues by share. Source: Google Annual Report 2005 and 2006. Information provided to the US Securities and Exchange Commission.

6.2.1.1. Online Advertising


6.2.1.2. The Web Search Engines Landscape

[Figure 3 chart: share from 0% to 100%, half-yearly from July 2001 to July 2007; series: Google, Yahoo!, MSN, T-Online, AOL, Other.]

Figure 3: Evolution of WebHits for search engines in Germany in the period 2001 to 2007. Source: WebBarometer, [Speck 2007] and own calculations

[Figure 4 chart: share from 0% to 100%, half-yearly from October 2001 to April 2007; series: Google, Yahoo!, Voila, Live, AOL, Others.]

Figure 4: Evolution of WebHits for search engines in France in the period October 2001 to September 2007. Source: Baromètre Secrets2Moteurs and own calculations

6.2.2. Issues with the Advertising Model

6.2.2.1. Conflicting Interests


6.2.2.2. The content quality problem


6.2.2.3. Self-bidding and Click-Fraud

6.2.3. Adjacent Markets

6.2.3.1. Search Engine Optimization


6.2.3.2. Business Search Solutions

6.2.3.3. Mobile Search

6.3. Social Aspects

6.3.1. Patterns

6.3.1.1. User behaviour patterns


6.3.1.2. Product Search


6.3.1.3. Vertical Search Engines

6.3.2. The Web 2.0 Context

6.3.2.1. Communities developing Search Engines


6.3.2.2. Communities tagging and filtering audiovisual content

6.3.3. Privacy, Security and Personal Liberty

6.3.3.1. Profiling of Individuals


6.3.3.2. Censorship


6.3.3.3. Racism and the Protection of Youth

"Empirical Analysis of Internet Filtering in China 'Esse est indicato in Google: Ethical and Political Issues in Search Engines'

6.3.4. Search Engine Result Manipulation

"Google Doesn't Like 911truth.org",

6.3.5. The public responsibility of search engines

6.3.5.1. Education and Learning

6.3.5.2. Are Search Engines a public good?


6.4. Annex: Profiles of Selected Search Engine Providers

6.4.1. Overview

Table 3: List of relevant search engines by area of operation. Source: Pandia (www.pandia.com), accessed 18/10/2007.

Table 4: Audio-visual search engines. Source: Pandia (www.pandia.com), accessed 18/10/2007.

Table 5: Comparison of the major search engine providers in terms of financial data (columns: Google, Yahoo!, MSN LiveSearch, Ask CitySearch). Note that the data are for the parent organization of the search engine provider, which in the case of Microsoft and IAC also has other important business operations. Source: Yahoo! Finance and company reports.

6.4.2. World-wide Players

6.4.2.1. Google (USA)

6.4.2.2. Yahoo! (USA)

6.4.2.3. MSN Live Search, Microsoft (USA)

6.4.2.4. Ask.com, Excite, CitySearch (USA)

6.4.3. Regional Champions

6.4.3.1. Baidu (China)

6.4.3.2. Yandex (Russia)

6.4.3.3. Rambler (Russia)

6.4.4. European Actors

6.4.4.1. Fast (Norway)

6.4.4.2. Exalead (France)

6.4.4.3. NetSprint (Poland)

6.4.4.4. Morfeo (Czech Republic)

6.4.4.5. Autonomy (United Kingdom)

6.4.4.6. Expert System (Italy)

7. SEARCH ENGINES FOR AUDIO-VISUAL CONTENT: LEGAL ASPECTS, POLICY IMPLICATIONS & DIRECTIONS FOR FUTURE RESEARCH

7.1. Introduction

• intellectual property rights
• competition law
• media law
• e-commerce law
• communications law
• law of obligations
• criminal law
• constitutional law & fundamental rights

7.2.

7.2.1. Four Basic Information Flows

7.2.1.1. Search Engines Gather and Organise Content

7.2.1.2. Users Query the Search Engine: From 'Pull' to 'Push'

7.2.1.3. Search Engines Return Results

7.2.1.4. Users obtain the Content

7.2.2. Search Engine Operations and Trends

7.2.2.1. Indexing

7.2.2.2. Caching

7.2.2.3. Robot Exclusion Protocols


7.2.2.4. From Text Snippets & Image Thumbnails to News Portals

7.2.2.5. Audio-visual search


7.3. Market Developments

7.3.1. The Centrality of Search


7.3.2. The Adapting Search Engine Landscape


7.3.3. Extending Beyond Search


7.4. Legal aspects

7.4.1. Copyright in the Search Engine Context


7.4.1.1. Right of reproduction

7.4.1.1.1 Indexing

7.4.1.1.2 Caching


7.4.1.2. Right of communication to the public

7.4.1.2.1 Indexed Information


7.4.1.2.2 Cached Information


7.4.2. Trademark Law

7.4.2.1. Early Litigation and Importance of Trademark Law

7.4.2.2. Scenarios and Legal Questions


7.4.3. Data Protection Law

7.4.3.1. Increasing Data Protection Concerns


7.4.3.2. Trends Towards Greater Personalisation


7.4.3.3. Data Protection Implications


7.5. Policy Issues: Three Key Messages

7.5.1. Increasing Litigation in AV Search Era: Law as a Key Policy Lever

7.5.1.1. Sharp Tensions Surrounding Search Engine Operations

7.5.1.2. Unclear Legal Status


7.5.1.3. The Role of Technology & Market Transactions


7.5.1.4. A Matter of Degree


7.5.2. Combined Effect of Laws: Need to Determine Default Liability Regime

7.5.2.1. Search Engines as Key Intermediaries

7.5.2.2. Focusing on Individual Law is Insufficient

7.5.2.2.1 Legal Regulation


7.5.2.2.2 Commercial Deals


7.5.2.3. In Search of the Default Liability Regime


7.5.3. EU v. US: Law Impacts Innovation In AV Search


7.5.3.1. Copyright Law

7.5.3.2. Trademark Law

To capture initial consumer attention, even though no action is completed as a result of the confusion, may still be an infringement. Cases cited: Brookfield Communications, Playboy, Geico.

7.6. Conclusions

7.7. Future Research

7.7.1. Social Trends

7.7.2. Economic trends

7.7.3. Further Legal Aspects

7.7.3.1. Constitutional law


7.7.3.2. Intellectual Property Rights


7.7.3.3. EU Competition Law


7.7.3.4. Media & Communications Law


7.7.3.5. Law of Obligations / Liability Law


ANNEX to chapter 3: Summary and goals of use cases

Effort | Name | UC | Action | Corpora | Research Method (Requirement) | Product | System Env. | User | User Class | Summary/Notes | Industry/Community
Project | DIVAS | 1 | Retrieval (General) | Text (Annotations) | Relevance Distance Metric | Unspecified | Unspecified | Unspecified | End User (Vague) | Similarity to user needs |
Project | DIVAS | 1 | Personalization | Annotations (Profiles) | Unspecified | Profile Manager/Editor | Unspecified | Unspecified | End User (Vague) | Profile creation (by specifying keywords and grouping them by importance); profile editing, sharing, re-use. |
Project | DIVAS | 1 | Content Delivery | Audiovisual | Unspecified | Targeted Delivery System | Unspecified | Unspecified | Content Provider | Enhanced syndication using profiles: format specification, content protection with DRM |
Project | DIVAS | 1 | Extraction/Indexing | Annotations (Profiles) | Unspecified | Indexed Content (Profiles) | Unspecified | Unspecified | End User (Vague) | Dynamic building of communities sharing similar interests (classified sets of peer communities) |
Project | DIVAS | 1 | Extraction/Indexing | Text (Annotations) | Audio-To-Concept | Index Generator (Audio) | Unspecified | Unspecified | Content Provider | Enable the speech-to-text transcription of audio content and subsequent text-to-text matching of lists of keywords. |
Project | DIVAS | 2 | Content Delivery | Audiovisual | Unspecified | Targeted Delivery System | Unspecified | Unspecified | Content Provider | Enable the use of an information provision service that regularly informs a user of new information. |
Project | DIVAS | 3 | Extraction/Indexing | Text (Annotations) | Audio-To-Concept | Index Generator (Audio) | Unspecified | Unspecified | Content Provider | Enable the speech-to-text transcription of audio content and subsequent text-to-text matching of lists of keywords. |
Project | DIVAS | 3 | Analytics (Multimedia) | Audiovisual | Segmentation | Index Generator (General) | Unspecified | Unspecified | Content Provider | Enable alignment-correct synchronization between the audio of video and transcriptions |
Project | DIVAS | 3 | Analytics (Text) | Text (Annotations) | Statistical Classification (General) | Concordance Generator | Unspecified | Unspecified | Content Provider | Enable statistical analysis of the most frequent spoken words in textual transcriptions of a spoken video. |
Project | DIVAS | 3 | Classification Systems (General) | Text (Concordances) | Unspecified | Taxonomy Generator (Topical) | Unspecified | Unspecified | Content Provider | Enable creation of structured, hierarchical lists. |
Project | DIVAS | 3 | Classification Systems (General) | Unspecified | Unspecified | Taxonomy Generator (Interlingual) | Unspecified | Unspecified | Content Provider | Automated word-to-word translation between languages. |
Project | DIVAS | 4 | Retrieval (Search) | Annotations (Video) | Unspecified | Retrieval System (General) | Unspecified | Unspecified | End User (Vague) | Locate video from a textual transcription |
Project | DIVAS | 4 | Retrieval (Search) | Audiovisual | Query by Example | Retrieval System (General) | Unspecified | Unspecified | End User (Vague) | Locate video from a picture |
Project | DIVAS | 4 | Retrieval (Search) | Audiovisual | Unspecified | Retrieval System (General) | Unspecified | Unspecified | End User (Vague) | Locate the original source of a video |
Project | DIVAS | 5 | Content Delivery | Audiovisual | Query by Controlled Metadata | Targeted Delivery System | Unspecified | Unspecified | Content Provider | Comparison of reference models to incoming feeds to deliver content for 3rd parties (“Picture Matching”); Audience monitoring; Broadcast regulation; Compliance notification |
Project | DIVAS | 5 | Extraction/Indexing | Audiovisual | Vague | Index Generator (General) | Unspecified | Unspecified | Content Provider | Encoding/analysis of multimedia (reference and incoming streams) |
Project | DIVAS | 6 | Retrieval (Search) | Audiovisual | Query by Example | Retrieval System (General) | Unspecified | Unspecified | End User (Vague) | Identify and locate an originating document using an extract of it. |
Project | DIVAS | 7 | Extraction/Indexing | Unspecified | Unspecified | Indexed Content (Vague) | Unspecified | Unspecified | Content Provider | |
Project | DIVAS | 7 | Retrieval (Search) | Controlled Metadata | Query by Controlled Metadata | Retrieval System (General) | Unspecified | Unspecified | Content Provider | Search for duplicates by comparing index values; output file will be created with found duplicates. |
Project | DIVAS | 8 | Extraction/Indexing | Audiovisual | Vague | Index Generator (General) | Unspecified | Unspecified | End User (Vague) | Automatically create fingerprints of songs and movies |
Project | DIVAS | 8 | Retrieval (Search) | Audiovisual | Query by Example (Fingerprint) | Retrieval System (General) | Unspecified | Unspecified | End User (Vague) | Use available fingerprint databases for audio identification |
Project | RUSHES | 1 | Retrieval (General) | Text (Annotations) | Query by Keyword | Retrieval System (General) | Unspecified | Unspecified | End User (Vague) | Search by using keywords or semantic concepts. |
Project | RUSHES | 1 | Retrieval (General) | Annotations (Video) | Query by Controlled Metadata | Retrieval System (General) | Unspecified | Unspecified | End User (Vague) | Search by using keywords or semantic concepts. |
Project | RUSHES | 2 | Retrieval (General) | Audiovisual | Query by Example | Retrieval System (General) | Unspecified | Unspecified | End User (Vague) | Search by visual similarity. |
Project | RUSHES | 3 | Retrieval (General) | Annotations (Video) | Query by Controlled Metadata | Retrieval System (General) | Unspecified | Journalist | End User (Professional) | Search by location, date and time. | Journalism, Broadcasting
Project | SAPIR | 1 | Retrieval (General) | Image | Query by Example | Retrieval System (General) | Open | Photography | End User (Simple) | Photo search in social network collections. | Photography
Project | SAPIR | 2 | Retrieval (General) | Vague | Query by Example | Retrieval System (General) | Unspecified | Consumer | End User (Simple) | Receipts search in social network collections by multimedia queries. | Consumer
Project | SAPIR | 3 | Retrieval (Browse) | Audio | Query by Example | Retrieval System (General) | Unspecified | Consumer, Music | End User (Simple) | Discover new music using audio excerpts (mobile). | Consumer, Music
Project | SAPIR | 4 | Extraction/Indexing | Image | Feature Detection | Indexed Content (Images) | Open | Tourism/Heritage | End User (Simple) | Retrieve information on buildings/objects depicted in tourist photos (mobile). | Tourism/Heritage
Project | SAPIR | 5 | Extraction/Indexing | Audiovisual | Unspecified | Index Generator (General) | Open | Unspecified | End User (Simple, Professional) | Extract information related to films that occur on TV. |
Project | SAPIR | 5 | Retrieval (General) | Audiovisual | Query by Example | Retrieval System (General) | Open | Unspecified | End User (Simple, Professional) | Search for films/video based on extracted features. |
Project | SAPIR | 6 | Retrieval (General) | Image | Query by Controlled Metadata | Retrieval System (General) | Open | Unspecified | End User (Simple, Professional) | Utilizes other people’s closeness to particular events to retrieve photos. |
Project | TRIPOD | 1 | Extraction/Indexing | Annotations (Image) | Query by Keyword | Index Generator (General) | Open | Unspecified | Content Provider | Augment photo metadata using captions (i.e., toponyms) to query the web and discover extra information. |
Project | TRIPOD | 2 | Extraction/Indexing | Controlled Metadata | Query by Controlled Metadata | Index Generator (General) | Open | Unspecified | Content Provider | Create rich photo captions by identifying features using full metadata (Location and Direction) to query geodatabases. |
Project | TRIPOD | 3 | Retrieval (Browse) | Image (Vague) | Query by Controlled Metadata (Profile) | Recommender System | Closed | Tourism/Heritage | Content Provider | Postcard recommender by profiles (taste, style) | Tourism/Heritage
Project | TRIPOD | 3 | Personalization | Image (Vague) | Query by Controlled Metadata (Profile) | Indexed Content (Profiles) | Closed | Tourism/Heritage | Content Provider | Postcard recommender by profiles (taste, style) | Tourism/Heritage
Project | VICTORY | 1 | Retrieval (General) | Unspecified | Vague | Retrieval System (General) | Closed | Automotive, Designers | End User (Professional) | Search for the parts designs using a similarity search. | Automotive, Product Designers
Project | VICTORY | 2 | Retrieval (General) | Text (Annotations) (Vague) | Vague | Retrieval System (General) | Unspecified | Designers, Decision Makers | End User (Professional) | Collaborative design through query extension and relevance feedback | Designers, Decision Makers
Project | VICTORY | 3 | Vague | Unspecified | Unspecified | Unspecified | Unspecified | Consumer, Designers | End User (Simple) | Information sharing/exchange. | Consumers, Design Community
Project | VICTORY | 4 | Unspecified | Vague | Unspecified | Social Sharing System | Open | Open Gaming Communities | End User (Simple) | Customized figure creation sharing. | Open Gaming Communities
Project | VICTORY | 5 | Retrieval (General) | Vague | Unspecified | Retrieval System (General) | Unspecified | Maintenance/Installation/Support Personnel | End User (Professional) | Access to guideline/catalog repositories. | Maintenance/Installation/Support Personnel
Project | VICTORY | 6 | Unspecified | Unspecified | Unspecified | Vague | Unspecified | Tourism/Heritage, Travel | End User (Simple) | Information capture and similarity search on objects of interest (mobile). | Tourism/Heritage, Travel
Project | VITALAS | 1,1 | Extraction/Indexing | Unspecified | Unspecified | Indexed Content (Profiles) | Unspecified | Archivist, Journalist | Content Provider | Build user profiles and personalize user access | Archivist, Journalist
Project | VITALAS | 1,2 | Retrieval (General) | Image | Query by Controlled Metadata (Profile) | Retrieval System (General) | Unspecified | Archivist, Journalist | End User (Professional) | Find pictures using a search profile. | Archivist, Journalist
Project | VITALAS | 1,2 | Personalization | Image | Query by Controlled Metadata (Profile) | Retrieval System (General) | | Archivist, Journalist | End User (Professional) | Find pictures relevant to a journalist's interests. | Archivist, Journalist
Project | VITALAS | 2,1 | Extraction/Indexing | Image | Unspecified | Index Generator (General) | Unspecified | Archivist, Journalist | Content Provider | Automated indexing | Archivist, Journalist
Project | VITALAS | 2,2 | Extraction/Indexing | Image | Visual-to-Concept | Index Generator (General) | Unspecified | Unspecified | Unspecified | Proximity measure: Fusion of visual and textual descriptors. |
Project | VITALAS | 2,2 | Extraction/Indexing | Image | Statistical Classification (General) | Clustering Algorithm | Unspecified | Unspecified | Content Provider | Classification or non-supervised clustering |
Project | VITALAS | 2,2 | Retrieval (General) | Image | GUI Development | GUI | N/A | Unspecified | End User (Simple, Professional) | Interactive visualization map |
Project | VITALAS | 2,3 | Retrieval (General) | Image | Relevance Feedback | Retrieval System (General) | Unspecified | Archivist, Journalist, Art Director | End User (Simple, Professional) | Interactive browsing based on cross modal proximity and interactive relevance feedback | Archivist, Journalist, Art Director
Project | VITALAS | 2,3 | Retrieval (General) | Image | Cross Modal Proximity | Retrieval System (General) | Unspecified | Archivist, Journalist, Art Director | End User (Simple, Professional) | Interactive browsing based on cross modal proximity and interactive relevance feedback | Archivist, Journalist, Art Director
Project | VITALAS | 2,4 | Retrieval (General) | Image | Query by Controlled Metadata | Retrieval System (General) | Unspecified | Archivist, Journalist | End User (Simple, Professional) | Retrieve pictures by matching text queries to conceptual links. | Archivist, Journalist
Project | VITALAS | 2,4 | Retrieval (General) | Image | Feature Detection | Retrieval System (General) | Unspecified | | | Retrieve pictures of a well known person. |
Project | VITALAS | 3,1 | Retrieval (General) | Audiovisual | Query by Example | Retrieval System (General) | Unspecified | Archivist, Researcher | End User (Simple, Professional) | Cross modal similarity search to find video extracts. | Archivist, Researcher
Project | VITALAS | 3,2 | Extraction/Indexing | Audiovisual | Unspecified | Indexed Content (Audiovisual) | Unspecified | Archivist, Researcher | Content Provider | Establish classification, index and retrieve sequences. | Archivist, Researcher
Project | VITALAS | 3,3 | Retrieval (General) | Audiovisual | Query by Controlled Metadata | Retrieval System (General) | Unspecified | Archivist, Researcher | End User (Simple, Professional) | Retrieval by semantic concept. (Concept Learning?) | Archivist, Researcher
Project | VITALAS | 3,3 | Extraction/Indexing | Audiovisual | Audio-To-Concept | Index Generator (Audio) | | Archivist, Researcher | Content Provider | Retrieval by semantic concept. (Concept Learning?) | Archivist, Researcher
Project | VITALAS | 3,3 | Extraction/Indexing | Audiovisual | Feature Detection (Face Recognition) | Indexed Content (Images) | Unspecified | Archivist, Researcher | Content Provider | Retrieval by semantic concept. (Concept Learning?) | Archivist, Researcher
Project | VITALAS | 4,1 | Retrieval (Browse) | Audiovisual | GUI Development | GUI | Unspecified | Archivist, Journalist, Researcher | End User (Professional) | Interactive cartographic visualization for content overview of video with spatial layout with constraints. | Archivist, Journalist, Researcher
Project | VITALAS | 4,1 | Retrieval (Search) | Audiovisual | | Indexing Manager/Editor (General) | Unspecified | Archivist, Journalist, Researcher | Content Provider | Tool development for building homogeneous and normalized textual metadata. | Archivist, Journalist, Researcher
Project | VITALAS | 4,1 | Extraction/Indexing | Audiovisual | Unspecified | Indexed Content (Audiovisual) | Unspecified | Archivist, Journalist, Researcher | Content Provider | Automatic indexing | Archivist, Journalist, Researcher
Project | VITALAS | 4,1 | Retrieval (Search) | Audiovisual | Unspecified | Retrieval System (General) | Unspecified | Archivist, Journalist, Researcher | Content Provider | Automatic or assisted identification of the main line of the program (backbone) | Archivist, Journalist, Researcher
Project | VITALAS | 4,2 | Retrieval (General) | Audiovisual | Query by Example | Retrieval System (General) | Unspecified | Archivist | End User (Professional) | Retrieve news programs with similar subjects (examples can be visual, audio or textual). | Archivist
Project | VITALAS | 4,3 | Extraction/Indexing | Audiovisual | Speech Recognition | Indexed Content (Audio) | Unspecified | Archivist | Content Provider | Audio detection of logos; Audio recognition of spoken words. | Archivist
Project | VITALAS | 4,3 | Extraction/Indexing | Audiovisual | Feature Detection | Indexed Content (Audiovisual) | Unspecified | Archivist | Content Provider | Visual object detection (logos, map, text incrustation). | Archivist
Project | VITALAS | 4,3 | Analytics (Multimedia) | Audiovisual | Segmentation | Multimedia Segments | Unspecified | Archivist | Content Provider | Localization of temporal "markers". | Archivist
National Initiative | iAD | N/A | Retrieval (General) | Vague | Unspecified | Retrieval System (General) | Unspecified | Unspecified | Unspecified | |
National Initiative | iAD | N/A | Analytics (General) | Vague | Unspecified | Retrieval System (General) | Unspecified | Unspecified | Unspecified | | Enterprise Search
National Initiative | iAD | N/A | Extraction/Indexing | Vague | Semantic Classification (General) | Unspecified | Unspecified | Unspecified | Unspecified | | Enterprise Search
National Initiative | IM2 | N/A | Retrieval (General) | Audiovisual | GUI Development | Social Sharing System | Unspecified | Unspecified | End User (Vague) | Development of new meeting browsers. |
National Initiative | IM2 | N/A | Analytics (Multimedia) | Audiovisual | Speech Recognition | Multimedia Segments | Unspecified | Unspecified | End User (Vague) | Automatic discourse processing |
National Initiative | IM2 | N/A | Analytics (Multimedia) | Audiovisual | Feature Detection | Multimedia Segments | Unspecified | Unspecified | End User (Vague) | Scene analysis, speaker segmentation and tracking, vocabulary. |
National Initiative | IM2 | N/A | Extraction/Indexing | Audiovisual | Audio-to-Concept | Indexed Content (Audiovisual) | Unspecified | Unspecified | End User (Vague) | Collection and full annotation of large amounts (100 hours) of multimodal meeting recordings |
National Initiative | IM2 | N/A | Extraction/Indexing | Audiovisual | Visual-to-Concept | Indexed Content (Audiovisual) | Unspecified | Unspecified | End User (Vague) | Collection and full annotation of large amounts (100 hours) of multimodal meeting recordings |
National Initiative | IM2 | N/A | Analytics (Multimedia) | Audiovisual | Segmentation | Multimedia Segments | Unspecified | Unspecified | End User (Vague) | |
National Initiative | MultimediaN | N/A | Retrieval (General) | Audiovisual | Vague | Retrieval System (General) | Unspecified | Unspecified | End User (Vague) | Semantic web search |
National Initiative | MultimediaN | N/A | Retrieval (General) | Audiovisual | GUI Development | GUI | Unspecified | Unspecified | End User (Vague) | Develop a multimedia browser (MediaMill). |
National Initiative | MultimediaN | N/A | Extraction/Indexing | Audiovisual | Semantic Classification (General) | Metadata Manager/Editor (General) | Unspecified | Unspecified | End User (Vague) | Develop a P2P system for video metadata exchange and indexing (StreetTivo). |
National Initiative | MultimediaN | N/A | Analytics (General) | Audiovisual | Unspecified | Metadata Manager/Editor (General) | Unspecified | Unspecified | End User (Vague) | Develop a P2P system for video metadata exchange and indexing (StreetTivo). |
National Initiative | MundoAV | N/A | Extraction/Indexing | Audiovisual | Vague | Indexed Content (Audiovisual) | Unspecified | Companies, Audiovisual Professionals | End User (Professional) | Selectively index content by development/deployment of a spider that retrieves indexed data from website by looking in Index Files, and stores it in a local database. |
National Initiative | MundoAV | N/A | Retrieval (General) | Audiovisual | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | |
National Initiative | QUAERO | N/A | Retrieval (General) | Audiovisual | Unspecified | Unspecified | Unspecified | Consumers | End User (Vague) | Search multimedia, including broadcast media, on name, context and metadata annotations. |
National Initiative | QUAERO | N/A | Retrieval (General) | Audiovisual | GUI Development | Unspecified | Unspecified | Consumers | End User (Vague) | Enhance user experience of multimedia search services by developing more convenient interfaces. |
National Initiative | QUAERO | N/A | Analytics (Multimedia) | Audiovisual | Segmentation | Multimedia Segments | Unspecified | Tourism/Heritage | Content Provider | Digital Heritage: Improve annotation and encoding by a combination of automatic and manual means. |
National Initiative | QUAERO | N/A | Analytics (Multimedia) | Audiovisual | Feature Detection | Indexed Content (Audiovisual) | Unspecified | Tourism/Heritage | Content Provider | Digital Heritage: Improve annotation and encoding by a combination of automatic and manual means. |
National Initiative | QUAERO | N/A | Analytics (Multimedia) | Audiovisual | Speech Recognition | Indexed Content (Audio) | Unspecified | Tourism/Heritage | Content Provider | Digital Heritage: Improve annotation and encoding by a combination of automatic and manual means. |
National Initiative | QUAERO | N/A | Retrieval (General) | Audiovisual | Vague | Digital Asset Manager | Unspecified | Broadcasting | Content Provider | Digital asset management: Multimedia ingestion, (post) production, aggregation, storage, search and re-purposing. |
National Initiative | QUAERO | N/A | Classification Systems (General) | Audiovisual | Semantic Classification (General) | Taxonomy Generator | Unspecified | Businesses | Content Provider | Platform for text/image annotation: Translation, classification and structuring of information into knowledge (including thesaurus construction and named entity recognition). |
National Initiative | QUAERO | N/A | Retrieval (Search) | Audiovisual | Query by Example (Fingerprint) | Retrieval System (General) | Unspecified | Broadcasting | Content Provider | Video and audio fingerprinting. |
National Initiative | THESEUS | N/A | Extraction/Indexing | Audiovisual | Video segmentation | Multimedia Segments | Unspecified | Unspecified | Content Provider | Video and image recognition. |
National Initiative | THESEUS | N/A | Extraction/Indexing | Audiovisual | Feature extraction | Indexed Content (Audiovisual) | Unspecified | Unspecified | Content Provider | |
National Initiative | THESEUS | N/A | Retrieval (Search) | Audiovisual | Query by Example | Retrieval System (General) | Unspecified | Unspecified | End User (Vague) | Semantic navigation and interaction. |
National Initiative | THESEUS | N/A | Retrieval (Search) | Controlled Metadata | Query by Controlled Metadata | Retrieval System (General) | Unspecified | Unspecified | End User (Vague) | Semantic navigation and interaction. |
National Initiative | THESEUS | N/A | Classification Systems (General) | Controlled Metadata | Semantic Classification (General) | Controlled Vocabulary Development | Unspecified | Unspecified | Content Provider | Ontology design, mapping, management, evolution; reasoning. |
National Initiative | THESEUS | N/A | Extraction/Indexing | Audiovisual | Machine Learning | | Unspecified | Unspecified | Content Provider | Statistical machine learning. |
National Initiative | THESEUS | N/A | Classification Systems (General) | Controlled Metadata | Semantic Classification (General) | Standards Development | Unspecified | Unspecified | Content Provider | Metadata standards and standardization. |
National Initiative | THESEUS | N/A | Personalization | Audiovisual | Unspecified | Unspecified | Unspecified | Unspecified | Content Provider | User adaptation and personalization |
National Initiative | THESEUS | N/A | Classification Systems (General) | Controlled Metadata | Semantic Classification (General) | Ontology/Taxonomy Manager/Editor | Unspecified | Unspecified | Content Provider | Visual ontology editing framework |
National Initiative | THESEUS | N/A | Extraction/Indexing | N/A | Semantic Classification (General) | Indexing Manager/Editor (General) | Unspecified | Unspecified | Content Provider | Visualization techniques for semantic annotation |
National Initiative | THESEUS | N/A | Retrieval (General) | N/A | GUI Development | Vague | Unspecified | Unspecified | End User (Vague) | Modular GUI framework for the visualization of semantic information in Web. |
National Initiative | THESEUS | N/A | Retrieval (General) | N/A | GUI Development | GUI | Unspecified | Unspecified | Content Provider | Visual ontology editing framework. |
National Initiative | THESEUS | N/A | Extraction/Indexing | N/A | GUI Development | GUI | Unspecified | Unspecified | Content Provider | Visualization techniques for semantic annotation. |
National Initiative | THESEUS | N/A | Vague | Unspecified | Vague | Vague | Unspecified | Unspecified | Unspecified | PROCESSUS: Integration of semantically enriched process-chains in and across industrial companies. |
National Initiative | THESEUS | N/A | Vague | Unspecified | Semantic Classification (General) | Vague | Unspecified | Unspecified | End User (Vague) | ALEXANDRIA: Publishing platform for user generated content; combines semantics and community recommendation. |
National Initiative | THESEUS | N/A | Unspecified | Image | Vague | Unspecified | Unspecified | Unspecified | Content Provider | MEDICO: Scalable semantical analysis of diagnostic images in medicine. |
National Initiative | THESEUS | N/A | Unspecified | Audiovisual | Vague | Unspecified | Unspecified | Unspecified | Content Provider | CONTENTUS: Process chain for providing semantic access to AV-archives as a part of safeguarding the national cultural heritage. |

Class | Project | UC | Primary Goal | Secondary Goal
Project | DIVAS | 1 | Segment location in audiovisual database of user relevant information. |
Project | DIVAS | 2 | Deliver new audiovisual information with respect to the interests and needs of a user in almost real time. |
Project | DIVAS | 3 | Content description and machine-assisted indexing of audiovisual corpora. |
Project | DIVAS | 4 | Locate audiovisual content (particularly original source video) corresponding to textual or visual material. |
Project | DIVAS | 5 | Identify whether a video sequence has been correctly viewed by users. |
Project | DIVAS | 6 | Retrieve audiovisual content using an extract of it. | Rapidly browse an extended database to identify and locate the originating audiovisual document.
Project | DIVAS | 7 | Search for audiovisual duplicates. |
Project | DIVAS | 8 | Efficiently search and tag unknown files within private music archives and within compressed domains. | Automated delivery of updates of new releases of songs or video clips.
Project | RUSHES | 1 | Find content by searching with keywords and semantic concepts. |
Project | RUSHES | 2 | Find content by visual search (similarity search). |
Project | RUSHES | 3 | Find media of a certain location, date and time. |
Project | SAPIR | 1 | Search for photos in social network’s photo collections. |
Project | SAPIR | 2 | Search for receipts by multimedia queries on a social network. |
Project | SAPIR | 3 | Discover and purchase new music on basis of a recorded audio excerpt. |
Project | SAPIR | 4 | Retrieve information in situ on buildings or objects depicted on the tourist’s photos. |
Project | SAPIR | 5 | Extract information related to films that occur on TV, and search for other films or video clips based on extracted features. |
Project | SAPIR | 6 | Utilize other people’s closeness to particular events to retrieve a photo. |
Project | TRIPOD | 1 | Augment existing photo captions. |
Project | TRIPOD | 2 | Create rich photo captions using geo-metadata. |
Project | TRIPOD | 3 | Recommend postcards to send using a user's preferences. |
Project | VICTORY | 1 | Search for the design of automotive parts. |
Project | VICTORY | 2 | Vague |
Project | VICTORY | 3 | Vague |
Project | VICTORY | 4 | Share customized figures among gamers. |
Project | VICTORY | 5 | Provide access to large repositories of maintenance, installation and support documents. |
Project | VICTORY | 6 | Provide information about objects of interest to travelers. |
Project | VITALAS | 1,1 | Personalize users' access to content. |
Project | VITALAS | 1,2 | Retrieve pictures that are relevant to a user's interests. |
Project | VITALAS | 2,1 | Automatically label visual concepts. |
Project | VITALAS | 2,2 | Provide a graphical interactive and efficient overview of a collection of pictures (~2000 items). |
Project | VITALAS | 2,3 | Provide an interactive navigation based on proximity criteria with user feedback. |
Project | VITALAS | 2,4 | Retrieve pictures from textual query using conceptual links. |
Project | VITALAS | 2,5 | Retrieve pictures of a well-known person. |
Project | VITALAS | 3,1 | Provide a cross modal similarity search to find video extracts. |
Project | VITALAS | 3,2 | Establish audio and visual classification. |
Project | VITALAS | 3,3 | Navigation of video content by concepts. |
Project | VITALAS | 4,1 | Create an interactive structured map of a video program to provide a general structured overview. |
Project | VITALAS | 4,2 | Find content about a subject that is similar to a news program. |
Project | VITALAS | 4,3 | Localisation of audio or graphic temporal "markers" associated with a program. |
National Initiative | iAD | N/A | Research next generation precision, analytics and scale in information access. | Build international networks to identify and execute on global disruption opportunities.
National Initiative | IM2 | N/A | Understand, present, and retrieve information from multimodal recordings of face-to-face meetings or lectures. | Development of new meeting browsers.
National Initiative | MultimediaN | N/A | Intuitive and easy multimedia retrieval (search and browse) using semantic concepts. | Enable P2P video indexing and metadata exchange.
National Initiative | MundoAV | N/A | Provide professionals, companies and institutions a complete, comprehensive, timely and authoritative referent of the audiovisual sector. |
National Initiative | QUAERO | N/A | |
National Initiative | THESEUS | N/A | |