List of Seminars
Written by Pablo Guerrero   
Monday, 08 October 2007



Seminar #1:
Exploring the Power of Links in Scalable Data Analysis

Jiawei Han, University of Illinois, Urbana-Champaign; Xiaoxin Yin, Google; and Philip Yu, IBM T. J. Watson Research Center
Duration: 1.5 hours
Tuesday Apr 8th, 11.15-12.45
Location: Uxmal
Algorithms such as PageRank and HITS were developed in the late 1990s to explore links among Web pages and discover authoritative pages and hubs. Links have also been widely used in citation analysis and social network analysis. However, there is a lack of systematic treatment of how to fully explore the power of links in scalable data analysis. Besides surveying recent research on link analysis and link mining, we also show that the power of links can be explored thoroughly to improve the effectiveness and efficiency of typical data analysis tasks, including classification, clustering, information integration, and other interesting data mining tasks, especially in multi-relational database and/or World-Wide Web environments. Some recent results that explore the crucial information hidden in links will be introduced, including (1) multi-relational data mining, (2) user-guided multi-relational clustering, (3) scalable methods for link-based cluster analysis, (4) information integration and object distinction analysis, and (5) link analysis for graph mining and information network mining. The power of links for other analysis tasks will also be discussed in the tutorial.
Jiawei Han is a Professor in the Department of Computer Science, University of Illinois at Urbana-Champaign. He has been working on research into data mining, data warehousing, database systems, data mining from spatiotemporal data, multimedia data, stream and RFID data, Web data, social network data, and biological data, with over 350 journal and conference publications. He has chaired or served on over 100 program committees of international conferences and workshops, including PC co-chair for KDD, SDM, and ICDM conferences, vice chair for ICDE and ICDM conferences, and Americas Coordinator for a VLDB conference. He is also serving as the founding Editor-in-Chief of ACM Transactions on Knowledge Discovery from Data. He is an ACM Fellow and has received the 2004 ACM SIGKDD Innovations Award and the 2005 IEEE Computer Society Technical Achievement Award. His book "Data Mining: Concepts and Techniques" (2nd ed., Morgan Kaufmann, 2006) has been widely used as a textbook worldwide.
Xiaoxin Yin is an applied researcher at Microsoft Research. He received his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign in May 2007. His research interests include data mining, link analysis, clustering, classification, similarity analysis, and multi-relational data mining. Xiaoxin Yin served as the information director for the ACM Transactions on Knowledge Discovery from Data in 2005-2007, and as a reviewer for many journals and conferences, including IEEE Transactions on Knowledge and Data Engineering, Data Mining and Knowledge Discovery, SIGMOD, KDD, ICDE, and VLDB.


Philip S. Yu received the B.S. degree in E.E. from National Taiwan University, the M.S. and Ph.D. degrees in E.E. from Stanford University, and the M.B.A. degree from New York University. He is with the IBM T.J. Watson Research Center and is currently manager of the Software Tools and Techniques group. Dr. Yu has published more than 500 papers in refereed journals and conferences. He holds or has applied for more than 300 US patents.
Dr. Yu is a Fellow of the ACM and of the IEEE. He is an associate editor of ACM Transactions on Internet Technology and ACM Transactions on Knowledge Discovery from Data. He is on the steering committee of the IEEE Conference on Data Mining. He was the Editor-in-Chief of IEEE Transactions on Knowledge and Data Engineering (2001-2004). Dr. Yu has received several IBM honors, including 2 IBM Outstanding Innovation Awards, an Outstanding Technical Achievement Award, 2 Research Division Awards, and the 92nd plateau of Invention Achievement Awards. He received a Research Contributions Award from the IEEE Intl. Conference on Data Mining in 2003 and also an IEEE Region 1 Award for "promoting and perpetuating numerous new electrical engineering concepts" in 1999. Dr. Yu is an IBM Master Inventor.


Seminar #2:
Mobile and Embedded Database Systems and Technology

Anil Nori, Microsoft Corp.
Duration: 3 hours
Tuesday Apr 8th, 14.00-15.30 and 16.00-17.30
Location: Uxmal
Recent advances in processors, memory, storage, and connectivity have paved the way for next-generation applications that are data-driven, whose data can reside anywhere (i.e. on the server, desktop, devices, embedded in applications) and that support access from anywhere (i.e. local, remote, over the network, from PDAs, in connected and disconnected fashion). Memory sizes have gone up and prices have come down significantly; with 64-bit addressability, it is not uncommon to configure servers with 8 to 16 GB of memory, and desktops with 2 to 4 GB of memory. With advances in flash memory technology, large flash drives are available at reasonable prices. Computers with 32 GB flash drives are making their way into the market. Flash drives not only eliminate seek time and rotational latency, they also consume significantly less power than conventional disk drives, making them ideal for portable devices. All these trends lead the way to applications that are data-centric, distributed, mobile, and embedded.

In this tutorial we will cover the following topics in detail: flash technologies and their applications; device trends and technologies; mobile and embedded applications and their requirements; mobile and embedded DBMS architectures; embedded DBMSs such as Berkeley DB, Sybase iAnywhere, SQL Server Compact, TinyDB, etc.; future trends and challenges.
Anil Nori is a Distinguished Engineer at Microsoft. He is an architect in the SQL Server organization, focusing on data and application platform technologies. He is part of the senior leadership overseeing the overall vision, strategy, and architecture for the Microsoft data platforms. He has over 25 years of experience in building complex database and application systems. Before coming to Microsoft, Anil was the CTO of Asera, which he co-founded with Vinod Khosla of Kleiner Perkins. Asera pioneered Composite Applications and the composite application platform. Prior to Asera, Anil was at Oracle as an Architect for the Oracle database system, where he was responsible for Oracle object-relational and extensible technology, Internet and multi-media DBMS development, and XML technology. Before joining Oracle, Anil was a Database Architect for DEC Rdb database products, where he was involved in the development of centralized and distributed DBMS products. Prior to DEC, Anil was a Computer Scientist at Computer Corporation of America, a leader in database research. Anil is active and well known in the database and applications community.


Seminar #3:
Stream Processing: Going Beyond Database Management Systems

Sharma Chakravarthy, University of Texas, Arlington
Duration: 3 hours
Wednesday Apr 9th, 9.00-10.30 and 11.00-12.30
Location: Uxmal
A large class of data-intensive applications in which data arrives as continuous streams is now widely recognized. Furthermore, these applications have to respond in a timely manner. In other words, they have specific Quality of Service (QoS) requirements for query processing. In this tutorial, we discuss the main challenges, approaches, techniques, and solutions for developing a general-purpose data stream management system (DSMS) and present our work in this area as well as the work in the literature. We present work on Aurora, STREAM, Fjord/Telegraph, and MavStream (to name a few), covering the major efforts in data stream management systems.
We will cover the following topics in detail during the tutorial: differences between traditional query processing in a DBMS and continuous query processing, operator and query modeling for stream processing, scheduling strategies (for conserving memory and reducing tuple latency), capacity estimation (to determine strategies and to determine when and how much load to shed), and load shedding strategies. The emphasis will be on satisfying QoS requirements as it is extremely important for stream processing applications. Implementation of a stream processing system will also be covered using the implementation of MavStream at UTA. Finally, the need for the integration of stream and complex event processing will be outlined.

Sharma Chakravarthy is a Professor in the Computer Science and Engineering Department at The University of Texas at Arlington, Texas. He established the Information Technology Laboratory at UT Arlington in Jan 2000 and currently heads it. He is the recipient of the college-level "Excellence in Research" award in 2006, the university-level "Creative Outstanding Researcher" award in 2003, and the department-level senior outstanding researcher award in 2002. His current research includes web technologies, stream data processing, information integration, mining and knowledge discovery (association, graph, and text), active databases, distributed and heterogeneous databases, query optimization, and multi-media databases. He has published over 120 papers in refereed international journals and conference proceedings. He has given tutorials on a number of database topics, such as graph mining, stream processing, database mining, and active, real-time, distributed, object-oriented, and heterogeneous databases, in North America, Europe, and Asia. He is listed in Who's Who among South Asian Americans and Who's Who among America's Teachers.
Prior to joining UTA, he was with the University of Florida, Gainesville. Prior to that, he worked as a Computer Scientist at the Computer Corporation of America (CCA) and as a Member, Technical Staff at Xerox Advanced Information Technology, Cambridge, MA. Sharma Chakravarthy received the B.E. degree in Electrical Engineering from the Indian Institute of Science, Bangalore, and the M.Tech. from IIT Bombay, India. He worked at TIFR (Tata Institute of Fundamental Research), Bombay, India for a few years. He received M.S. and Ph.D. degrees from the University of Maryland, College Park, in 1981 and 1985, respectively.


Seminar #4:
The Java Persistence API (JPA): Technology, Standards, and Implementations

Patrick Linskey, BEA Systems, Inc.
Duration: 3 hours
Wednesday Apr 9th, 14.00-15.30 and 16.00-17.30
Location: Uxmal
The tutorial starts with a high-level overview of the Java Persistence API (JPA), mainly focusing on JPA version 1.0. We will briefly survey the landscape of JPA implementations and discuss upcoming features in the JPA 2.0 specification, currently in development. Thereafter, we will examine the steps required to build a persistent domain model and a related service that uses JPA. Two deployment scenarios are considered: deployment as an EJB 3 stateless session bean and as part of a J2SE application. The last half of the tutorial examines JPA's support for relationships, detachment, attachment, and queries. Questions are welcomed throughout.

Patrick has been involved in object/relational mapping for 6+ years. As the founder and CTO of SolarMetric, Patrick drove the technical direction of the company and oversaw the development of Kodo. Now at BEA, he leads the EJB team in the design and implementation of the WebLogic Server EJB solution. Patrick is one of the leaders on the EJB3 and the JDO specification teams, and is BEA's representative on the EJB3 expert group. Patrick is involved in several industry consortia, serving as a luminary on JDOcentral and as the moderator of the forthcoming JavaPersistence.com. He has been the face of standards-based persistence, having evangelized JDO and EJB Persistence in hundreds of talks throughout the world. Patrick is co-author of Bitter EJB, and is on the JAOO Conference Program Committee. Patrick has also worked for TechTrader, MIT's Media Lab, and Bank One in various technical roles. Under Patrick's leadership, Kodo has become the market-leading JDO implementation, with over 450 customers throughout the world spanning all industries.


Seminar #5:
Data and Metadata Alignment: Concepts and Techniques

Lise Getoor, University of Maryland and Renee Miller, University of Toronto
Duration: 3 hours
Thursday Apr 10th, 14.00-15.30 and 16.00-17.30
Location: Uxmal
Alignment is the act of adjusting or aligning the parts of a device in relation to each other. Information alignment is the process of finding, modeling, and using the correspondences or connections that place information artifacts in relation to each other. Alignment forms the basis of many information integration, sharing, and management tasks ranging from data integration and exchange to data cleaning, record linkage, and deduplication. In many cases, there is no single optimal alignment; the best alignment is task- and context-specific.
The way in which alignment is performed can also be quite different depending on the task and context. For example, in aligning ontologies the primary tools used are conceptual modeling techniques together with logical inference. In aligning data objects, statistical inference is used. For both tasks, inference may be augmented with techniques from natural language processing or other reasoning based on information theory, semantics, or a variety of task-specific principles.
This tutorial will provide an introduction to the basics of alignment as used in common information management tasks.  We will give a taxonomy of tasks, and discuss how alignment is exploited in each. Our classification includes data alignment, metadata alignment, and new unified approaches which combine both. A primary goal of our tutorial will be to identify commonalities and differences in the way alignment has been formalized and used in different environments and communities.

Lise Getoor is an assistant professor in the Computer Science Department at the University of Maryland, College Park. She received her PhD from Stanford University in 2001. Her current work includes research on link mining, statistical relational learning, and representing uncertainty in structured and semi-structured data. She has published numerous articles in machine learning, data mining, database, and AI forums. She is a member of the AAAI Executive Council, is on the editorial board of the Machine Learning Journal, is a JAIR associate editor, and has served on a variety of program committees including AAAI, ICML, IJCAI, KDD, SIGMOD, UAI, VLDB, and WWW.

Renee J. Miller is a professor of computer science and the Bell Chair of Information Systems at the University of Toronto. She received the US Presidential Early Career Award for Scientists and Engineers (PECASE), the highest honor bestowed by the United States government on outstanding scientists and engineers beginning their careers. She received an NSF CAREER Award, the Premier's Research Excellence Award, and an IBM Faculty Award.  Her research interests are in the efficient, effective use of large volumes of complex, heterogeneous data. This interest spans data integration and exchange, inconsistent and uncertain data management, and knowledge curation. She serves on the Board of Trustees of the VLDB Endowment, was a member of and chaired the ACM Kanellakis Awards committee, and served as PC co-chair of VLDB in 2004.  She received her PhD in Computer Science from the University of Wisconsin, Madison and bachelor's degrees in Mathematics and Cognitive Science from MIT.


Seminar #6:
Performance Evaluation in Database Research: Principles and Experience

Ioana Manolescu, INRIA Futurs and Stefan Manegold, CWI
Duration: 3 hours
Thursday Apr 10th, 14.00-15.30 and 16.00-17.30
Location: Coba
A significant part of today's database research focuses on improving the performance of a specific system. Quantitative experiments are the best way to validate such results. However, performing experiments is not always easy. Besides the complexity of the system under test, designing an experiment, choosing the right environment and parameter values, analyzing the gathered data, and reporting it to a third party in an expressive and intelligible way are all hard.
In this tutorial, we present a general roadmap to the above steps, based on classical measure taking theory, as well as our own experience. The tutorial is primarily aimed at MS and PhD students seeking to improve their experiment practices, but more senior attendees may also find it interesting.
The tutorial will also devote a short time (~15 minutes) to tips and tricks on how to organize and present code that performs experiments, so that an outsider can repeat them.

Ioana Manolescu is a researcher in the Gemo group of INRIA Futurs, in France. Her research work is centered around XML data management, distributed data management systems, and Web application modeling. She is a founder of two SIGMOD-affiliated workshops: XIME-P, on XQuery processing, and ExpDB, on experimental evaluation in database research.

Stefan Manegold is a researcher in the database group at CWI in Amsterdam, The Netherlands. His research work comprises database architectures and data management on modern hardware as well as database-supported XML processing, with a particular interest in performance and benchmarking. He is co-founder of the DaMoN workshop series (co-located with SIGMOD since 2005) and co-chair of ExpDB 2007.
Last Updated ( Wednesday, 23 January 2008 )
 