There are those who believe that creating massive amounts of high-quality machine learning training data is a job best left to small teams of very expensive data scientists. For them, we offer Best Practices for DIY Data, a guide to training data in-house.
Lunch and Label is a popular tactic among the Training Data DIYers. Data scientists offer to provide lunch in the hopes that colleagues will show up, eat, and annotate images with bounding boxes.
Of course, the menu is crucial to the success of a Lunch and Label. Veteran DIYers take into consideration all of the following:
1. What you serve is critically important.
You want people to show up and keep coming back, because experienced Lunch and Labelers are more productive.
Unfortunately, most people quickly discover they don’t like drawing polygons during their lunch hour. And no one will do it twice for pizza.
In fact, don’t be surprised if Lunch and Label becomes your second biggest project outlay after data scientist payroll.
2. Let nothing get in the way of productivity.
Data labeling is digital. As in, people use their digits to do it.
Messy sandwiches, tacos, and anything else that requires two hands to eat are non-starters. You want people labeling while they eat, not before and after.
3. Eliminate distractions.
It goes without saying that you don’t want anyone’s attention to wander, especially after they’ve figured out that data labeling is not a whole lot of fun.
Follow the lead of savvy Lunch and Label hosts: announce at the beginning that, for legal reasons, there can be no chance of project data being copied, then confiscate everyone’s personal devices.
4. Alcohol is a two-edged sword.
Sure, advertising that top-shelf bourbon will be served could bring in more people. It’s likely the only way you’ll get anyone to attend a second Lunch and Label.
But lest we forget, training data accuracy is a mighty high priority when it comes to ML algorithms, particularly if they’ll be powering your next Uber ride.
If getting deep into menu planning wasn’t why you went to grad school, maybe it’s time to offload your training data challenges.
We take over the entire process of preparing ML training data. You give us your data and your requirements. We give you back very accurate datasets, at whatever scale you require.
Drop us a line if you want lunch to be the way it used to be.