[Proposal] Bounty for high quality data science posts

  • Problem: Data science posts are scarce, because people are protecting their ideas
  • Proposal: Monthly bounty for the highest rated data science posts to incentivize sharing
  • Details: On a predefined schedule, the CoE will pay a bounty for the best data science posts. “Like” count is a straightforward metric, but better suggestions are welcome.
  • Timeline: This could happen every month
  • Best-case outcome: The community receives a large number of high-quality data science posts and discussions, and the metamodel improves as a result
  • Worst-case outcome: People keep hiding their ideas, or people go Like hunting.
  • Funding: 10 NMR / month

At Kaggle I saw a good deal of discussion on various data science topics. Most posts were specific enough to spark an idea that others could try out, but generic enough that others couldn’t copy it easily.

I understand that people are afraid of sharing ideas, because they want to protect their MMC.
However, sharing generic ideas and methods does help others and improves the metamodel, while others are not going to be able to copy the exact same code.
I’ve already shared many of my ideas, and I will keep sharing. I would like to encourage others to share as well.

25 Likes

Would love to see more quality sharing. However, the metric (like count) can easily be manipulated. That may not be a big problem if it is not the only metric. One suggestion would be to have CoE reviews / ratings / filters in addition to the raw like count.

Manual review by one member of the CoE would be best. But obviously it takes time.
Like count + a quick validity check would be something in the middle.

6 Likes

Bumping it up!
Did this proposal get any attention from the CoE?

5 Likes

Hey, we are bumping this up. I’m going through the GitHub board and cleaning things up a bit, and I want to get this back out and discussed. As the writer of the newsletter, I’m thinking it could be a good idea to use that platform for community data science posts. They could be 1) sent out using the existing newsletter email list AND/OR 2) posted in a new tab on the Numerbay site. With the big projects @richai has set up recently, as well as some lengthy discussions in the RocketChat/forums, this could be a great place to have some community members write articles for small bounties (maybe something like $250-500 per article?). Thoughts, everyone?

4 Likes

There are certainly ideas that are worth $500 or even more. And it’s great if people are encouraged to share them here.

However, just another simple guide on a common technique is not worth even half of that. Medium and the Kaggle forums are full of them for free. They are free, because that is their fair value :slight_smile:

How are you going to judge whether an article is worth $500?
Any ideas for decision criteria?

Very true. I’m sort of thinking along the lines of very complex topics, for example something like the projects @richardcraib has discussed recently (yes, I am aware that the bounty for his projects is coming from the Numerai side). Also, topics narrowed down specifically to Numerai data, Signals, etc. that you would not find anywhere else. What were some of the topics you were thinking of when writing this up 6 months ago, if you remember off the top of your head?

As for payment, we have always settled on a blanket rate of $50 an hour, converted to NMR at the time projects are completed. So I can see these exceeding $500, especially if a lot of time goes into the writing, which could be an issue. At the same time, the right topics and articles could be worth it. What do you think?
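To make the arithmetic concrete, here is a rough sketch of that hourly-rate calculation. Only the $50 an hour rate comes from the post above; the function name, the hours worked, and the NMR price in the usage note are hypothetical placeholders.

```python
def bounty_in_nmr(hours_worked, nmr_price_usd, hourly_rate_usd=50.0):
    """Return (usd_total, nmr_total) for an article paid at a flat hourly rate.

    nmr_price_usd is whatever the NMR/USD price is when the work is completed;
    it has to be supplied by the caller (assumption: no price feed is used here).
    """
    if nmr_price_usd <= 0:
        raise ValueError("nmr_price_usd must be positive")
    usd_total = hours_worked * hourly_rate_usd
    return usd_total, usd_total / nmr_price_usd
```

For instance, `bounty_in_nmr(12, 20.0)` gives 600.0 USD and 30.0 NMR under a purely illustrative price of $20 per NMR, so an article taking more than 10 hours already exceeds $500.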

Strong solutions to Richard’s projects are worth probably $10k+
Nobody is going to share proven strong ideas without great rewards. But it’s helpful to have open discussion about various topics and their application to the Numerai dataset. These could spark good ideas in great minds.

I deliberately suggested 10 NMR max per month, because it’s inherently hard to judge the actual value of any shared idea. Some seemingly simple ideas could turn out to have great value if applied properly. And something can be super complicated and still be junk.

I was thinking more about social recognition and encouragement of sharing ideas that may or may not work. Money is just a part of the game.

Like count is an easy measure of value.
Yes, it can be abused, but if that 10 NMR is shared among the top 3 topics monthly, it’s not going to cause any problems or break the bank. Nor is it a big enough sum to encourage any kind of abuse.
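A minimal sketch of that monthly split, assuming the pool is divided equally among the top 3 posts by like count. The equal split, the tie handling (earlier entries win), and the function name are assumptions, since the proposal does not specify them.

```python
def split_bounty(posts, pool_nmr=10.0, top_n=3):
    """posts: list of (title, like_count) pairs for one month.

    Returns a list of (title, payout_nmr) for the top_n most-liked posts,
    with the pool divided equally among them (assumed rule).
    """
    # sorted() is stable, so posts with equal likes keep their input order.
    ranked = sorted(posts, key=lambda p: p[1], reverse=True)[:top_n]
    if not ranked:
        return []
    share = pool_nmr / len(ranked)
    return [(title, share) for title, _likes in ranked]
```

With four posts at 25, 6, 5 and 3 likes, the first three would each receive about 3.33 NMR and the fourth nothing, so the most anyone could gain by gaming the like count is a few NMR.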

3 Likes

Strong solutions to Richard’s projects are worth probably $10k+

Very true!

A lot to think about here. Hopefully we get some more comments/thoughts

1 Like

As a new user of Numerai, I think this would be extremely helpful. Having high-quality data science posts would certainly help with onboarding, encouraging and retaining more users.

A prime example of this from my experience as a new user is @studym8’s Numerai YouTube tutorial series. Flattening the learning curve and ramping up expertise is paramount. How else will we get to 100,000 data scientists contributing to Numerai?

1 Like

Data science posts are scarce.

For me, the reasons are:

  1. I’m protecting my ideas
  2. My poor English :grin:

1 Like

@studym8’s YouTube tutorial series was funded separately, but any participant can submit high-quality data science posts in the data science channel of the forums on advanced topics surrounding the tournament, e.g. TC and FNCv3 studies. The CoE can pay out bounties, at their discretion, for work above and beyond what would be considered a normal post.

1 Like

Fantastic, thanks for letting me know of this possibility @aventurine

It would be nice if the CoE could make transparent how you guys assess the merit of any DS-related post: what quantitative and qualitative metrics you might end up using, and how regularly you actually do that. :slight_smile:

1 Like

I think you should be allowed to post high-quality content in your native language, or at least in a language you are comfortable with that is also known by some competent people in this space. Off the top of my head, I think Japanese and Chinese (simplified/traditional) should be OK, and probably the same for Russian and the Latin and Germanic languages. The rest can be left to machine translation :slight_smile:

3 Likes

They are free, because that is their fair value

I get it, but I’d like to disagree. I have two options every time I write something on Medium: either a paywall that will restrict new users from reading my posts, or keeping them open to all for free. I pretty much share all my ideas, along with Colab notebooks, in each blog post. I always go with the second option. I learned most of what I know for free, and this is my way of contributing back, regardless of like counts.

3 Likes