
README.md


ds4se

Data Science for Software Engineering (ds4se) is an academic initiative to perform exploratory analysis on software engineering artifacts and metadata. It covers data management, analysis, and benchmarking for deep learning (DL) and traceability.

A Data Science for Software Engineering Library (DS4SE-API)

Project Leads: Nathan, @danaderp

Description: Software data comprises any type of artifact, such as source code, requirements, user stories, screens, and binaries. Automating software engineering tasks with machine learning requires considerable effort to adapt algorithms and deep learning approaches to software data. SEMERU Lab is working on a solution for processing any type of data produced during the software lifecycle. The DS4SE library was created to manage, describe, explore, infer, visualize, represent, and mine software data by relying on statistical theory and machine learning libraries. The DS4SE architecture follows the paradigm of “exploratory programming” to enhance the development process. However, most of the modules that compose the library are incomplete, disconnected from one another, or undocumented. For this project, we need a motivated team to help us connect, refactor, and implement several data science components critical to future research in SEMERU Lab. You will be working on the back end. The team will be divided into 3 domains:

  • Back-End Development and Refactoring,
  • Interface and Facade Implementation (or API), and
  • Testing.

Project Description for CSCI 435/535

Project Goals:

  • Implement the Initial Data Analysis module based on SE metrics theory
  • Refactor the Exploratory Data Analysis module based on information science theory
  • Integrate data science components, such as causal inference and data representation, from other repositories (e.g., COMET)
  • Expose the API so that other teams can consume it (the Project #1 team should consume your services)
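
To make the goal of exploratory analysis grounded in information science more concrete, here is a minimal sketch of one such measure: the Shannon entropy of the token distribution of a software artifact. It uses only the Python standard library and is not the ds4se API; the function names `tokenize` and `shannon_entropy`, the naive tokenizer, and the two sample artifacts are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Split an artifact into lowercase word tokens (a deliberately naive, hypothetical tokenizer)."""
    return re.findall(r"[a-z_]+", text.lower())


def shannon_entropy(tokens: List[str]) -> float:
    """Shannon entropy, in bits, of the empirical token distribution."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Two made-up artifacts: a requirement and a piece of source code.
requirement = "The user shall reset the password"
source_code = "def reset_password(user): user.password = None"

for name, artifact in [("requirement", requirement), ("source code", source_code)]:
    tokens = tokenize(artifact)
    print(f"{name}: {len(tokens)} tokens, entropy = {shannon_entropy(tokens):.3f} bits")
```

A real module would likely replace the regular-expression tokenizer with one suited to each artifact type; the entropy computation itself would stay the same.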

Requirements:

  • Required Knowledge Prerequisites: Python and Git
  • Preferred Knowledge Prerequisites: Machine Learning, Statistical Computing
  • Exploratory Programming with nbdev (link)
  • Manage your Data Science Project Structure in Early Stage (blog)

Install

pip install ds4se
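
After installing, one way to confirm that the package is visible to your interpreter is to query its metadata with the standard library. This is a small sketch that assumes the distribution is registered under the name `ds4se`, as in the command above; `installed_version` is a hypothetical helper, not part of the library.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("ds4se")
if version:
    print(f"ds4se version: {version}")
else:
    print("ds4se is not installed in this environment")
```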

How to use

Fill me in please! Don't forget code examples:

1+1
2