Page 3 – ACM SIGMOD Blog

Boris Glavic

January 24, 2023

Why Uncertainty is Unavoidable and What We Can Do About That

Uncertainty arises naturally in many application domains due to measurement errors, human error in data entry or transformation, missing data and bias in data collection, and many other reasons. When uncertainty is ignored during data preprocessing and analysis, the result is hard-to-trace errors that can have severe real-world implications, such as false incarcerations […]

karimaechihabi

December 29, 2022

Similarity Search for Scalable Data Science: The Past, Present and Exciting Road Ahead

Data Science

Similarity search is a fundamental building block for a myriad of critical data science applications involving large collections of high-dimensional objects, including data discovery, data cleaning, information retrieval, classification, outlier detection and clustering. Similarity search finds objects in a collection close to a given query according to some definition of sameness. This challenging problem has […]

Senjuti Basu Roy

October 26, 2022

Returning Top-K: Preference Aggregation or Sortition, or is there a Better Middle Ground?

Recommendations

Given a large number of users’ preferences (numerical or ordinal scores, ranked order) over a large number of objects, returning top-k results entails selecting a small list/set containing exactly k objects that are most “appropriate”. In this article, I will investigate two alternatives for selecting a top-k list/set that consumes such preference-based inputs. […]

Zhifeng Bao

September 17, 2022

Managing and Exploiting Massive Geolocation Data

Big Data, Spatial

The sheer volume, variety, and velocity of data in this modern era have enabled significant advancements in many research areas. However, the advancements that Big Data has brought to the research community do not necessarily translate into benefits for society, for ordinary people living ordinary lives. There is indeed a gap between breakthroughs in the […]

Themis Palpanas

August 25, 2022

A Brief History of Data Series Indexing: from Time Series to High-Dimensional Vectors and Deep Neural Network Embeddings

Data Series

In this post, we motivate the need for efficient and effective solutions for data series similarity search, and we briefly present the work that has been done in this direction by the data series community. We also discuss the relationship to high-dimensional (high-d) vectors and deep neural network embeddings, point to the relevant efforts in […]
