Pawsome Updates December '22

Active Learning Dec 08, 2022

We are excited to bring you our very first monthly update from your DagsHub DevRel team! The goal is simple: to keep you up to date on the latest projects and news from DagsHub in a concise monthly newsletter. It is much easier than backtracking through Twitter or LinkedIn 😉 Each month, we will cover any competitions and challenges we hosted or sponsored, DagsHub Learning webinars, and the MLOps Podcast, and we'll highlight blogs featuring DagsHub that talented tech writers have published.

This month you will find:

🚀 Launch of DDA

🤩 Build an Active Learning Pipeline

📄 Reproducibility Challenge - Fall ’22

🎥 DagsHub Learning

🎙 MLOps Podcast - Julia Language in Production

🎃 Hacktoberfest ’22

DDA Launch

The start of November brought the launch of something we have been pouring everything into - DagsHub Direct Data Access (DDA). Data management has its challenges. DVC addresses them by version-controlling our data alongside the code files tracked by Git. Although that is a great start, we still lacked the ability to access our data at file-level granularity. This led us to build DDA, an easy-to-use addition to our client and API that gives data scientists a natural way to manage, collaborate on, and scale their work.

DDA provides an intuitive interface to stream and upload data for any ML project and does not require any adaptation. Like any proud parent, I could go on. There is so much more to dive into, so check out the DDA Launch Blog.

Build an End-2-End Active Learning Pipeline with Yono Mittlefehldt

Active Learning is a subset of the Data-Centric AI approach, focused on iteratively improving the data and annotations at hand. In the era of big data, Active Learning is a game changer, as it allows us to get significant improvements in our model's performance while using much less data - saving us time and money.

However, building an active learning pipeline is a difficult task and used to be reserved for big corporations and companies with extensive MLOps support. But NOT anymore. Yono Mittlefehldt, our talented ML engineer, managed to build a full Active Learning pipeline in his Squirrel Detector project - using only open-source and free tools. The results are nothing less than 🧠 blowing!

You can follow along and learn how to implement it in your own project!

Build an End-2-End Active Learning Pipeline: Part 1

Build an End-2-End Active Learning Pipeline: Part 2
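If you are new to the idea, the heart of an active learning loop is deciding which unlabeled samples are worth annotating next. Below is a minimal sketch of one common strategy, uncertainty sampling by prediction entropy. It is a generic illustration, not the method used in the Squirrel Detector project: the function names, image ids, and probabilities are all made up for the example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predictions, budget):
    """Return the ids of the `budget` most uncertain unlabeled samples.

    `predictions` maps a sample id to the model's class probabilities.
    """
    ranked = sorted(
        predictions,
        key=lambda sample_id: entropy(predictions[sample_id]),
        reverse=True,  # highest entropy (least confident) first
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled images:
# [P(squirrel), P(no squirrel)]
unlabeled = {
    "img_001": [0.98, 0.02],
    "img_002": [0.55, 0.45],
    "img_003": [0.70, 0.30],
    "img_004": [0.50, 0.50],
}

# Send only the two samples the model is least sure about to annotators.
to_annotate = select_for_labeling(unlabeled, budget=2)
print(to_annotate)  # ['img_004', 'img_002']
```

In a full pipeline, the selected samples would be annotated, added to the training set, and the model retrained before the next round of selection.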

Reproducibility Challenge

This month, we announced our sponsorship of the Papers with Code Reproducibility Challenge. This marks the third time DagsHub has supported participants taking on the challenge and making a lasting impact on machine learning reproducibility. There is still plenty of time to take part, as the deadline for submissions is in February. Find more info in the official sponsorship announcement: ML Reproducibility Challenge - Fall 2022

DagsHub Learning - Data Versioning and Streaming with DVC and DagsHub

Nir closed out the month bringing you the latest DagsHub Learning webinar - Version and Stream Data with DVC and DagsHub.

DVC has become one of the most widely adopted open-source tools for data and model versioning. DagsHub is integrated with DVC, enabling you to connect a remote DVC storage to DagsHub, or use our built-in storage, and view DVC-versioned files alongside your code files, with the ability to diff, discuss, and merge them.
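To give a feel for what versioning data alongside code means in practice, here is a toy sketch of the general idea: keep the large file out of Git and commit a tiny pointer file that records its content hash instead. This is a simplified illustration, not DVC's implementation or file format. The helper names `write_pointer` and `has_changed` and the `.pointer.json` suffix are hypothetical, and MD5 is used only as a content fingerprint.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def write_pointer(data_path: Path) -> Path:
    """Write a small pointer file recording the hash and size of data_path."""
    content = data_path.read_bytes()
    pointer = {
        "md5": hashlib.md5(content).hexdigest(),  # content fingerprint
        "size": len(content),
        "path": data_path.name,
    }
    pointer_path = data_path.with_name(data_path.name + ".pointer.json")
    pointer_path.write_text(json.dumps(pointer, indent=2))
    return pointer_path


def has_changed(data_path: Path, pointer_path: Path) -> bool:
    """Report whether the data no longer matches the recorded hash."""
    recorded = json.loads(pointer_path.read_text())
    current = hashlib.md5(data_path.read_bytes()).hexdigest()
    return current != recorded["md5"]


with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "labels.csv"
    data.write_text("image,label\nimg_001,squirrel\n")

    # The pointer file is what you would commit to Git.
    pointer = write_pointer(data)
    unchanged = has_changed(data, pointer)

    # Editing the data makes it diverge from the committed pointer.
    data.write_text("image,label\nimg_001,squirrel\nimg_002,none\n")
    changed = has_changed(data, pointer)

print(unchanged, changed)  # False True
```

Because the pointer file is small and plain text, it can be diffed and reviewed like any other file in the repository, while the data itself lives in separate storage.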

The webinar covered DVC and DDA fundamentals, how they work, and core commands. But it wouldn’t be a DagsHub Webinar without some hands-on experience - so Nir showed how to version a project with Git and DVC and then stream the data using DDA.

Whether you missed it or want to go back and practice again, here are the resources you will need:

MLOps Podcast with Logan Kilpatrick - Julia Language in Production

On the newest episode of the MLOps Podcast, Dean Pleban sits down with Logan Kilpatrick, Julia Language Developer Community Advocate (and the new OpenAI DevRel - congrats!). Logan shares how his experience working with NASA led him to discover Julia and build his career from there. Dean and Logan also discuss the never-ending debate of Julia vs. Python, new scientific discoveries, making authentic connections, and the future of machine learning.

Hacktoberfest

We launched three challenges for this year's Hacktoberfest! Our users rose to the occasion and contributed to Papers with Everything, 3D datasets, and audio datasets. For the 3D challenge, we added 3D data catalog capabilities to DagsHub! You can now upload 3D models or motion clips to DagsHub and view, move, and diff them! We had a total of 37 PRs across all three challenges - and if you want to help the ML community, we're still open for contributions.

Happy Holidays 🕎🎄

As we close out the year, we want to extend a big thank you for all your contributions to the machine learning field. Whether it was through DagsHub repos, challenges, attending our webinars, or any other way, together we will push things further in 2023. Regardless of what holiday you celebrate, we hope you enjoy this time with your loved ones. Happy Holidays from the DagsHub Team!
