# github-search

Repository for the "Searching Github Python repositories with machine learning" master's thesis

Jakub Bartczuk

## Running this project

### Prerequisites

Preprocessing steps were tested on a machine with 64 GB of RAM.

Training the Graph Neural Networks requires a CUDA-capable GPU.

### General remarks

The project uses nbdev to generate Python modules from Jupyter notebooks. To "make" the project, run

```shell
nbdev_build_lib; pip install -e .
```

in the root directory.
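For orientation, nbdev (v1, the version that provides `nbdev_build_lib`) exports notebook cells marked with special comments into the library package. The module name `utils` and the function below are hypothetical, purely to illustrate the mechanism:

```python
# A notebook cell in an nbdev v1 project might look like this.
# `# default_exp utils` (normally in the first cell) names the target module,
# and `# export` marks a cell to be written into that module.

# default_exp utils

# export
def normalize_repo_name(name: str) -> str:
    """Lowercase a GitHub repository name and strip surrounding whitespace."""
    return name.strip().lower()
```

After running `nbdev_build_lib`, exported cells like this end up in the generated package (here, a `utils.py` module) and are importable like ordinary Python code.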

We use ploomber for managing training and data preprocessing.

For example, to create CSV files with extracted READMEs, run

```shell
ploomber build --partial make_readmes --skip-upstream --force
```

Relevant definitions can be found in `pipeline.yaml` and `env.yaml`.
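As a rough sketch of what such a definition looks like, a ploomber `pipeline.yaml` task entry follows the shape below; the paths and product names here are hypothetical, not this project's actual configuration:

```yaml
tasks:
  # A notebook task: ploomber executes the notebook and tracks its products.
  - source: notebooks/make_readmes.ipynb
    name: make_readmes
    product:
      nb: output/make_readmes.ipynb   # executed copy of the notebook
      data: output/readmes.csv        # extracted READMEs
```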

### TODO: Downloading data

### TODO: Model checkpoints

## Training models

Ploomber step: `run_gnn_experiment`
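By the same pattern as the `make_readmes` example above, this step can presumably be run on its own (exact flags may need adjusting for your setup):

```shell
ploomber build --partial run_gnn_experiment --force
```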

### TODO: Using models
