Tesla dataset shows either safety increase or decrease, depending on data cleaning!

"NHTSA took air bag deployments before and after Autosteer installation to estimate the number of crashes per million miles. But most of the cars reported by Tesla were missing the miles the car traveled before Autosteer was installed. With no miles at all to add to the equation, but the same number of air bag deployments, any findings would inflate the crash rate for pre-Autosteer cars"


As one of my coworkers used to say, "data science is 90% data cleaning". Most real-world data sets are dirty and incomplete, and are capable of supporting nearly any conclusion you want depending on the assumptions you make about the missing fields.
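
To make the quoted point concrete, here is a toy sketch with made-up numbers (not NHTSA or Tesla data). The records, field layout, and function names are all hypothetical. It assumes the "dirty" analysis effectively counts missing pre-Autosteer mileage as zero miles while still counting those cars' air bag deployments, and compares that with simply dropping the incomplete records.

```python
# Hypothetical toy records: (airbag_deployments, miles_before_autosteer).
# None means the pre-Autosteer mileage was not reported.
records = [
    (1, 50_000),
    (0, 40_000),
    (1, None),
    (1, None),
    (0, 10_000),
]


def rate_per_million_miles(rows):
    """Air bag deployments per million miles over fully populated rows."""
    crashes = sum(c for c, _ in rows)
    miles = sum(m for _, m in rows)
    return crashes * 1_000_000 / miles


def missing_as_zero(rows):
    """Assumed 'dirty' treatment: keep the crashes, count missing miles as 0."""
    return [(c, m if m is not None else 0) for c, m in rows]


def drop_missing(rows):
    """One possible cleaning choice: exclude cars with unknown mileage."""
    return [(c, m) for c, m in rows if m is not None]


naive = rate_per_million_miles(missing_as_zero(records))
cleaned = rate_per_million_miles(drop_missing(records))

print(naive)    # 30.0
print(cleaned)  # 10.0
```

Same cars, same crashes, and the pre-Autosteer rate triples purely because of how the missing field is handled. Dropping rows is not automatically the right answer either; it is just a different assumption about the missing data.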

@nindokag Teaching is 90% tweaking of past year questions, readapting past materials for a new syllabus, etc. I can see where they intersect.
