Invited Talks


 

New Challenges in Petascale Scientific Databases

by Alexander Szalay

The Johns Hopkins University

Scientific data is doubling every year. Virtual Observatories are being established over every scale of the physical world: from elementary particles to materials, biological systems, environmental observatories, remote sensing, and the universe. These collaborations collect increasing amounts of data, often at rates close to petabytes per year. Many scientists will soon obtain most of their data from large scientific repositories, often stored in the form of databases. The talk will discuss the different requirements for such databases and examine user behavior in a few concrete examples taken from astronomy, in particular from six years of usage of the Sloan Digital Sky Survey database. Interesting query patterns are emerging, where users create custom “crawlers” to break large queries into many repetitive ones. The trial-and-error behavior of many exploratory projects will also be discussed. The talk will also present various scalable alternatives to large scientific analysis facilities.

 

Adventures in the Blogosphere

by Nick Koudas

University of Toronto

Blogs, social networks, wikis and microblogging are proliferating at an unprecedented pace. The numbers reported quantifying user engagement are profound. In this talk, I will present BlogScope (www.blogscope.net), a system under development at the University of Toronto that aims to collect, process and distill in real time the information in social media. I will present the system, its architecture and the difficulties encountered, and highlight the research challenges in building its components. I will also present Grapevine, BlogScope's sister project, which aims to make sense of the social media space in real time. I will detail areas of research related to the scope of these projects and present challenges that could be addressed using scientific and statistical database techniques. If time permits, I'll present demos.

 

The evolution of vertical database architectures - a historical survey

by Per Svensson

Swedish Defence Research Agency

In this talk, we will survey and discuss the evolution of a certain class of database architectures, today usually called "vertical databases". These architectures have their roots in designs stemming from the early or mid-1970s, a fact rarely recognized in modern papers on the topic. Technical topics to be discussed include the evolution from 1979 to the present of storage structures, query processing and optimization techniques, and basic algorithms for single- and multivariable queries in vertical databases. These issues will be approached from the perspective and performance needs of a large-scale scientific or statistical data analyst, whose needs call for design priorities different from those of conventional business RDBMS applications.