May 10, 2022
Financially, poor data quality costs organizations ludicrous amounts of money. Worse, poor data quality is a strong inhibitor of success in data science: no analytical method can create value from poor-quality data. As a consequence, data science projects invest a majority of their resources in cleansing data. However, cleansing resists automation as […]

June 20, 2019
The vision of natural language interfaces to databases (NLIDBs) is to make data stores more accessible to a wide range of non-tech-savvy end users, with the ultimate goal of talking to a database (almost) as one would to a human. While the database community initially focused on relational databases, there is currently a renaissance of building […]

August 21, 2018
Overview of DEEM 2018
The ACM SIGMOD Second Workshop on Data Management for End-to-End Machine Learning (DEEM) was successfully held last June in Houston, TX. The goal of DEEM is to bring together researchers and practitioners at the intersection of applied machine learning (ML) and data management/systems research to discuss data management/systems issues in ML […]

June 25, 2018
Information visualization is an essential tool in the arsenal of a data scientist: visualizations help identify trends and patterns, spot outliers and anomalies, and verify hypotheses. Moreover, visualizations are visceral and intuitive: they tell us stories about our data; they educate, delight, inform, enthrall, amaze, and clarify. This has led to the overwhelming popularity of […]

February 14, 2018
The web is an ever-evolving source of information, with data and knowledge derived from it powering a great range of modern applications. Accompanying the huge wealth of information, web data also introduces numerous challenges due to its size, diversity, volatility, inaccuracy, and contradictions. This year’s WebDB 2018 theme emphasizes the challenges and opportunities that arise […]