Crowdsourcing is the practice of collecting data directly from humans, which is useful for projects where human expertise is needed. Challenges in this space include creating worker environments that are pleasant to use and that respect our crowd workers’ skill, asking questions in the most cost-effective ways, selecting the questions that maximize information gain per cost, and finding and eliminating spam to ensure high quality without discarding legitimate work.

When using Mechanical Turk, it is very important to respect the workers’ time and expertise. We suggest that teammates read and follow the We are Dynamo Guidelines for Academic Requesters.

Related topics


  • Concept Embeddings with SNaCK This paper presents our work on “SNaCK,” a low-dimensional concept embedding algorithm that combines human expertise with automatic machine similarity kernels. The two parts are complementary: human insight can capture relationships that are not apparent from the objects’ visual similarity, and the machine can relieve the human from having to exhaustively specify many constraints.
  • Cost-effective HITs for Relative Similarity Comparisons Recently in machine learning, there has been a growing interest in collecting human similarity comparisons of the form “Is a more similar to b than to c?” Each answer provides a constraint, $d(a, b) < d(a, c)$, for some perceptual distance function $d$. In computer vision, perceptually-based embeddings can be constructed from many of these ...
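To make the triplet constraints above concrete, here is a minimal sketch (the function name and toy data are illustrative, not from either paper) of how one might score a candidate embedding against crowd answers: each answer “a is more similar to b than to c” is satisfied when $d(a, b) < d(a, c)$ under Euclidean distance in the embedding.

```python
import numpy as np

def satisfied_fraction(embedding, triplets):
    """Fraction of triplets (a, b, c) with d(a, b) < d(a, c),
    using Euclidean distance in the candidate embedding."""
    hits = 0
    for a, b, c in triplets:
        d_ab = np.linalg.norm(embedding[a] - embedding[b])
        d_ac = np.linalg.norm(embedding[a] - embedding[c])
        if d_ab < d_ac:
            hits += 1
    return hits / len(triplets)

# Toy 2-D embedding of four objects and two crowd answers.
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.2, 0.1]])
triplets = [(0, 1, 2), (3, 0, 2)]
print(satisfied_fraction(X, triplets))  # both constraints hold -> 1.0
```

Embedding algorithms such as those discussed above search for coordinates that satisfy as many of these crowd-sourced constraints as possible.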