I propose developing a community-based compute solution intended for users who, for whatever reason (technical / cloud-dependencies / …), can’t use the newly developed Numerai Compute solution, which will be AWS-based.
Keep in mind that at the time of writing, the team’s new Numerai Compute is still being beta-tested, and since this proposal may depend on their solution, things can change in the coming weeks. I still want to put this proposal out in the open to gauge whether people are actually interested, in which case I will follow through. I had wanted to present it at the London CoE meetup, but unfortunately I got sick and couldn’t attend. I did give the slides to @ia_ai, who put them online; thanks for that.
1. Proposal
I’d like to build an open-source community compute solution for Numerai model predictions.
Ideation: There has been a lot of talk lately about the team moving towards a daily compute solution. They are currently beta-testing it, and from what I heard from 2chan it looks to be a clean, simple solution (as in one-line commands). They did state, though, that initially it will be an AWS-only solution, which will remain an issue for part of the tournament users for various reasons. I have been playing around for quite some time with different local and cloud compute solutions, hunting for the simplest, most reliable and most cost-effective one. It now seems more important than ever to get a general solution in place that takes away the difficulties of managing infrastructure, whether on-premise or cloud-based.
Problem statement:
Like a lot of tournament participants, I want to be able to easily create and manage the compute solutions I use for daily or weekly predictions. I want to set them up quickly, make changes quickly, scale up or down if needed, and get notified of any problems. It should be a one-line-command-based solution (e.g. python) that is in line with, or an extension of, the Numerai Compute solution.
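To make the "one-line command" idea a bit more concrete, here is a minimal sketch of what such a command-line entry point could look like. Everything in it is an assumption for illustration only: the program name `community-compute`, the `deploy` and `status` subcommands and their flags are hypothetical and are not part of Numerai Compute or any existing tool. The commands only describe what they would do; nothing is provisioned.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI layout; names and flags are illustrative assumptions.
    parser = argparse.ArgumentParser(prog="community-compute")
    sub = parser.add_subparsers(dest="command", required=True)

    deploy = sub.add_parser("deploy", help="provision a compute engine")
    deploy.add_argument("--target", default="local",
                        help="'local' or a cloud provider name")
    deploy.add_argument("--schedule", choices=["daily", "weekly"],
                        default="daily")

    sub.add_parser("status", help="show the state of the compute engine")
    return parser


def main(argv=None) -> str:
    # Parse the command line and return a description of the planned action.
    args = build_parser().parse_args(argv)
    if args.command == "deploy":
        return f"would deploy to {args.target} on a {args.schedule} schedule"
    return "would report status"


if __name__ == "__main__":
    print(main())
```

With a layout like this, a user would type a single line such as `community-compute deploy --target oracle --schedule weekly`, and the tool would take care of the rest.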
Solution: An open-source software solution that performs the compute described above, with the option to run it on various cloud providers (azure/oracle/cloud) or as local compute (windows with wsl2, vm/docker, dedicated hardware) based on Ubuntu. Initially it would support python-based prediction pipelines. The basic functionality should be the following:
- Easy setup of Compute Engine: It should be possible to provision/deploy a compute solution based on user input (cloud or local, scaling, location, …)
- Management of Compute Engine: It should be possible to manage the compute engine remotely, that is updating pipelines / changing starting times / changing notifications / …
- Monitoring and Logging: The solution should notify the user of events, e.g. when an error occurs.
- IP: The solution should run on the user's own local compute or cloud, that is they have full control of their compute and no other user can access it (or the data).
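As a rough illustration of how the setup and notification points above could fit together, here is a minimal sketch. The `ComputeConfig` class, its fields and the `run_pipeline` helper are hypothetical names invented for this example; they do not reflect an existing design or the Numerai Compute API. The `send` callable stands in for whatever notification channel would be chosen (email, chat, …).

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ComputeConfig:
    """Hypothetical user input for provisioning a compute engine."""
    target: str = "local"      # "local" or a cloud provider name
    region: str = ""           # only needed for cloud targets
    schedule: str = "daily"    # "daily" or "weekly"
    notify: List[str] = field(default_factory=list)  # contacts to alert

    def validate(self) -> None:
        # Reject settings the engine could not act on.
        if self.schedule not in ("daily", "weekly"):
            raise ValueError(f"unknown schedule: {self.schedule}")
        if self.target != "local" and not self.region:
            raise ValueError("cloud targets need a region")


def run_pipeline(pipeline: Callable[[], object],
                 config: ComputeConfig,
                 send: Callable[[str, str], None]) -> bool:
    """Run a prediction pipeline and notify every contact on failure."""
    try:
        pipeline()
        return True
    except Exception as exc:
        for contact in config.notify:
            send(contact, f"Pipeline failed: {exc}")
        return False
```

Keeping the configuration in one small object means the same user input could drive a local or a cloud deployment, and the notification channel can be swapped without touching the pipeline code.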
2. Timeline
If approved, the earliest start date would be late August or early September. We should start with an MVP, for example Oracle compute based on Ubuntu, and then extend the solution from there.
3. Best case outcome
A solution that works seamlessly with the Numerai Compute solution and is used by a large part of the community. The open-source solution is easily maintained by other developers and can be extended with other functionality, e.g. automatic uploading to Numerbay. The Numerai team sees the added value of the solution and works together with the community developers to maintain it.
4. Worst case outcome
The solution is not adopted by the community, in which case all efforts would have been in vain.
5. Success criteria
TBD. The success criteria should be defined together with the CoE and the community.
6. Funding required, if any.
I have done voluntary work before, and I also created some examples of how to do compute on Azure and Oracle; however, those are still meant for experienced users. Creating a solution that can be used by a large part of the community will take more effort in terms of software development, testing and documentation. I cannot put an exact number on it yet, but delivering an MVP should be possible in several weeks (ballparking it). Once the idea is approved, I will make a detailed plan including expected work hours. In any case, this is too much work to do on a voluntary basis again like before lol.
Points for discussion:
- Minimum Features that should be in the MVP
- Long-term / Short-term goal
- Expected deliverables
- Anything else
Well, let's first see what people think of it before I add more stuff to this already long story.