Many ML projects, including Kaggle competitions, follow a similar workflow. You start with a simple pipeline and a baseline model.
Then you begin to incorporate improvements: add features, increase the data, adjust the model... In each iteration, you evaluate your solution and keep the changes that improve the target metric.

This workflow implies running many experiments. Over time, it becomes difficult to keep track of progress and of which changes helped.
Instead of working on new ideas, you spend time thinking:
- "Have I tried this already?"
- "What was the value of that hyperparameter that worked so well last week?"
You end up running the same things several times. If you are not tracking your experiments yet, I recommend you start!
In my previous Kaggle projects, I used to rely on spreadsheets for tracking. Setting up and maintaining spreadsheets with experiment metadata requires a lot of additional work. I got tired of manually filling in model parameters and performance values after each experiment and really wanted to switch to an automated solution.
That is when I discovered Neptune.ai. This tool allowed me to save a lot of time and concentrate on modeling decisions, which helped me win three medals in Kaggle competitions.
In this post, I will share my story of switching from spreadsheets to Neptune for experiment tracking. I will describe some disadvantages of spreadsheets, explain how Neptune helps to solve them, and give some tips on using Neptune for Kaggle.
What's wrong with spreadsheets for experiment tracking?
Spreadsheets are excellent for many purposes. For experiment tracking, you can simply set up a spreadsheet with different columns containing the relevant parameters and the performance of your pipeline. It is also easy to share this spreadsheet with teammates.

Looks great, right?
Unfortunately, there are some problems with it.
Manual work
After doing this for a while, you will notice that maintaining a spreadsheet starts eating a lot of time. You must manually fill in a row of metadata for each new experiment and add a column for each new parameter. It gets out of control as soon as your pipeline becomes more sophisticated.
It is also very easy to make a typo, which can lead to bad decisions.
While working on a deep learning competition, I once entered an incorrect learning rate in one of my experiments. I did not realize it was a typo and concluded that the poor performance came from the low learning rate. This cost me two days of work invested in the wrong direction based on a false conclusion.
No live tracking
With spreadsheets, you must wait until an experiment is completed to record its performance.
Besides being frustrating to do manually, this does not let you compare intermediate results across experiments, which is useful for seeing whether a new run looks promising.
Of course, you could log the model performance after each epoch, but doing that manually for every experiment requires even more time and effort.
Attachment limitations
Another problem with spreadsheets is that they only support textual metadata that fits into a cell.
What if you want to attach other metadata such as:
- model weights,
- source code,
- plots with model predictions,
- the input data version?
You need to store these things manually in your project folders, outside the spreadsheet.
In practice, it is difficult to organize and synchronize such outputs across local machines, Google Colab, Kaggle Notebooks and other environments your teammates may use. Attaching these metadata to a tracking spreadsheet sounds useful, but it is very hard to do.
Switching from spreadsheets to Neptune
A few months ago, our team was working on the Cassava Leaf Disease competition and used Google spreadsheets for experiment tracking. One month into the challenge, our spreadsheet was already a mess:
- Some runs were missing performance scores because one of us forgot to log them, and the results were gone.
- PDFs with loss curves were scattered across Google Drive and Kaggle Notebooks.
- Some parameters may have been entered incorrectly, but it would have taken too long to restore and check the older script versions.
It was difficult to make good data-driven decisions based on our spreadsheet.
Although only four weeks were left, we decided to move to Neptune. I was surprised to see how little effort it actually took to set up. In short, there are three main steps:
- Sign up for a Neptune account and create a project,
- Install the neptune package in your environment,
- Add a few lines to your pipeline to enable logging of the relevant metadata.
You can read more about the exact steps to start using Neptune here. Of course, going through the documentation and getting familiar with the platform may take a few hours. But remember that this is a one-time investment: after learning the tool once, I could automate much of the tracking and reuse it in new Kaggle competitions with very little extra effort.
Check the docs
See the docs to learn how you can organize ML experiments in Neptune, or jump to an example project and explore the application (no registration needed).
What is so good about Neptune?

Less manual work
One of the main advantages of Neptune over spreadsheets is that it saves a lot of manual work. With Neptune, you use the API inside your pipeline to automatically log and store metadata while the code is running.
```python
import neptune.new as neptune

# your credentials
run = neptune.init(project='#', api_token='#')

# relevant parameters
config = {"batch_size": 64, "learning_rate": 0.001, "optimizer": "Adam"}
run["parameters"] = config

# track the training process by logging your training metrics
for epoch in range(100):
    run["train/accuracy"].log(epoch * 0.6)

# log the final results
run["f1_score"] = 0.66
```
You no longer need to manually insert the results into a table, and you are also safe from typos. Since the metadata is sent to Neptune directly from the code, all the numbers will arrive correctly, no matter how many digits they have.
... The time saved logging each experiment accumulates very quickly and leads to tangible gains ... It gives you the opportunity to focus better on modeling decisions.
It may seem small, but the time saved on each experiment accumulates very quickly and leads to tangible gains by the end of the project. It frees you from thinking about the tracking process itself and lets you focus better on modeling decisions. In a way, it is like hiring an assistant to take care of some boring (but very useful) logging tasks so that you can concentrate on the creative work.
Live tracking
What I like about Neptune is that it allows you to track experiments live. If you work with models such as neural networks or gradient boosting, which require many iterations before convergence, you know it is very useful to analyze the loss dynamics early to detect problems and compare models.
Tracking intermediate results in a spreadsheet is very frustrating. With Neptune, you can start comparing learning curves while your experiment is still running.
... Many ML experiments have negative results ... Using the Neptune dashboard to compare intermediate plots with earlier performance values may be enough to realize that you need to stop the experiment and change something.
This proves to be very useful. As you might expect, many ML experiments have negative results (sorry, but that great idea you worked on for a few days actually decreases accuracy).
This is completely fine, because this is how ML works.
What is not fine is that you may have to wait a long time to get this negative signal from your pipeline. Using the Neptune dashboard to compare intermediate plots with earlier performance values may be enough to realize that you need to stop the experiment and change something.
Attaching outputs
Another advantage of Neptune is the ability to attach virtually anything to each experiment run. This really helps to keep important outputs, such as model weights and predictions, in one place and to access them easily from the experiments table.
This is particularly useful if you and your colleagues work in different environments and need to upload the outputs to synchronize the files.
I also like the ability to attach the source code of each run, to make sure you keep the version of the notebook that produced the corresponding result. This can be very useful if you want to revert changes that did not improve performance and go back to the best previous version.
Tips to improve your Kaggle performance with Neptune
When working on Kaggle competitions, there are a few more tips I can give to improve your tracking experience.
Use Neptune in Kaggle and Google Colab Notebooks
First, Neptune is very useful when working in Kaggle Notebooks or Google Colab, which have session time limits when using a GPU/TPU. I cannot count how many times I lost all the outputs of an experiment because of a notebook crash when training ran just a few minutes over the allowed 9-hour limit!
To avoid this, I recommend setting up Neptune so that model weights and loss metrics are stored after every epoch. This way, you will always have a checkpoint uploaded to the Neptune servers to resume your training from, even if your Kaggle Notebook crashes. You will also be able to compare the intermediate results logged before the session failure with other experiments to judge their potential.
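A minimal sketch of this per-epoch logging pattern is shown below. The training loop and file names are hypothetical, and a plain dict stands in for the Neptune run object so the snippet runs standalone; in real code the two calls inside `log_epoch` would be `run["train/loss"].log(loss)` and `run["checkpoints"].upload(weights_path)`:

```python
def log_epoch(run, epoch, loss, weights_path):
    # With a real Neptune run these would be:
    #   run["train/loss"].log(loss)
    #   run["checkpoints"].upload(weights_path)
    # Here `run` is a plain dict so the sketch is self-contained.
    run.setdefault("train/loss", []).append(loss)
    run.setdefault("checkpoints", []).append(weights_path)

run = {}
for epoch in range(3):
    loss = 1.0 / (epoch + 1)            # stand-in for a real training step
    path = f"model_epoch_{epoch}.pt"    # stand-in checkpoint file name
    log_epoch(run, epoch, loss, path)

print(run["checkpoints"][-1])  # the latest checkpoint to resume from
```

Because every epoch pushes both the metric and the checkpoint, a crashed session still leaves you with the last uploaded state to resume from.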
Check the docs
Find out how you can run your model training on Google Colab and track it with Neptune.
Updating runs with the Kaggle leaderboard score
Secondly, an important metric to track in Kaggle projects is the leaderboard score. With Neptune, you can track your validation score automatically, but not the leaderboard score, which requires submitting predictions on the Kaggle website.
The most convenient way to add the leaderboard score of your experiments to the Neptune tracking table is to use the resume-run functionality. It lets you update any finished experiment with a new metric in a few lines of code. This feature is also useful for resuming crashed sessions, which we discussed in the previous paragraph.
```python
import neptune.new as neptune

run = neptune.init(project='your-kaggle-project', run="SUN-123")

# add a new metric
run["lb_score"] = 0.5

# download a snapshot of the model weights
run["train/model_weights"].download()

# continue working
```
Experiment meta-analysis
Finally, I know that many Kagglers like to perform complex analyses of their submissions, such as estimating the correlation between CV and LB scores or plotting the dynamics of the best score over time.
While it is not yet feasible to do these things in the web app, Neptune allows you to download the metadata of all experiments directly into your notebook with a single API call. This makes it easy to dive deeper into the results, or to export the metadata table and share it with people who use a different tracking tool or do not track experiments at all.
```python
import neptune.new as neptune

my_project = neptune.get_project('your-workspace/your-kaggle-project')

# get the table with runs contributed by 'sophia'
sophia_df = my_project.fetch_runs_table(owner='sophia').to_pandas()
sophia_df.head()
```
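To illustrate the kind of meta-analysis this enables, here is a small sketch estimating the correlation between CV and LB scores. The score lists are made up for the example; in practice you would take the corresponding columns from the fetched runs table:

```python
# hypothetical CV and leaderboard scores taken from the runs table
cv_scores = [0.891, 0.895, 0.902, 0.899, 0.910]
lb_scores = [0.886, 0.890, 0.899, 0.893, 0.907]

def pearson(xs, ys):
    # plain-Python Pearson correlation between two score lists
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

corr = pearson(cv_scores, lb_scores)
print(f"CV/LB correlation: {corr:.3f}")
```

A correlation close to 1.0 suggests that your local validation is a reliable proxy for the leaderboard, so you can trust CV-based decisions between submissions.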
Final thoughts
In this post, I shared my story of switching from spreadsheets to Neptune for tracking ML experiments and highlighted some of Neptune's advantages. I would like to emphasize once again that investing time in infrastructure tools, whether for experiment tracking, code versioning or anything else, is always a good decision and will likely pay off with increased productivity.
Tracking experiment metadata with spreadsheets is much better than not tracking at all. It helps you see your progress, understand which modifications improve your solution, and make better modeling decisions. But doing it with spreadsheets also costs extra time and effort. Tools like Neptune take experiment tracking to the next level, letting you automate the metadata logging and focus on modeling decisions.
I hope you found my story useful. Good luck with your future ML projects!