[Proposal] Numerai Community Marketplace

Thanks to @arbitrage for the draft feedback.
This proposal follows the format suggested by @jrb

Anything in the below proposal is open for discussions and changes. Thank you for the feedback in advance.


1. Proposal
TL;DR: I’d like to build an open-source prototype of a StockX-style marketplace for Numerai model predictions. A static mockup: [image not included]

Ideation: The idea hit me during my recent GPU hunting craze on StockX. StockX is a stock-market-like online marketplace for consumer goods (sneakers, electronics, etc.) that emphasizes authenticity and price discovery. I thought its auction mechanism and item authentication mechanism could be desirable for a marketplace of Numerai predictions, too.

This project is meant to be a proof-of-concept demo showcasing an ideal user experience for both sellers and buyers of Numerai model predictions. This would lay the foundation for future contributions to make it fully production-ready. Core UX and features will be functional, but things like payments and cybersecurity won’t be the focus of this project.

Problem statement:
Like many tournament participants, I want to be able to sell my predictions to friends and strangers alike to monetize my models further. There is no go-to marketplace, so people are trying various channels, such as NFTs, e-commerce platforms, generic data exchanges or even offline sales. This is not efficient for either party. Moreover, there are some Numerai-specific needs these methods cannot accommodate, such as model ownership verification, enforcement of prediction file consistency, price discovery, intellectual property protection, automated submission and fraud protection.

Solution: A community-run marketplace that will tackle the above pain points. The platform will have the following features:

a. Core features:

  1. Buy-and-sell with price discovery: In addition to the basic buy-and-sell functionality, there is a limit order book mechanism for price discovery of model predictions. The seller puts up predictions for sale for each model each round, and has full control over the ask side of the order book to determine how many files to sell at each price [denoted as “File Mode”]. Alternatively, the seller can control how much NMR stake can be put on each file sold by using the number of NMR as the transaction unit instead of the number of files [denoted as “Stake Mode”].
    @arbitrage pointed out that model predictions are not common goods like sneakers; they are unique, and hence the ask side of the order book can be redundant here. This is true: each model’s sell side of the order book can only be managed by the single seller of that model. However, in Stake Mode the order book can still be useful. The seller can effectively perform tiered pricing of the stake allowance by managing the sell-side order book, e.g. sell 100 NMR allowance at $50 and 500 NMR allowance at $100, such that buyers have to pay progressively more if they want to stake more.
  2. Ownership verification: Buyers can be assured that the files belong to the seller’s model. During seller onboarding, the seller would prove ownership of their models by temporarily adding a one-time code to their model descriptions, similar to how domain name ownership verification is done. Payouts to sellers are locked up until at least the first live score. The live score would be compared against that of the seller’s own submission as a consistency check.
  3. Intellectual property protection: For sellers concerned about buyers abusing their files, they can opt for selling in Stake Mode. The platform will facilitate automatic submission for the buyers such that they would not have access to the raw file yet can still stake on it.
  4. Fraud protection: Each seller needs to lock up a collateral amount with the platform during onboarding. And for any model sold in Stake Mode, each buyer needs to lock some collateral amount at the time of the transaction. Transaction proceeds are withheld from the seller until live scores are checked for consistency. If inconsistent, the proceeds are returned to the buyers, and the seller will lose some collateral with the platform. For Stake Mode transactions, violations from the buyer such as over-staking (if the file was sold through automated API submission) can result in loss of collateral for the buyer.
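
To make the Stake Mode tiered pricing from feature 1 concrete, here is a minimal sketch of how a buyer's order could walk the seller's ask side. It assumes prices are quoted per NMR of allowance (the proposal does not say whether a price applies per NMR or per tier), and the names `AskLevel` and `quote_stake_allowance` are hypothetical, not part of any existing code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class AskLevel:
    price_cents_per_nmr: int  # assumed unit: cents per 1 NMR of stake allowance
    nmr_available: int        # allowance the seller still offers at this price


def quote_stake_allowance(asks: list[AskLevel], nmr_wanted: int) -> int:
    """Return the total cost in cents of buying `nmr_wanted` NMR of allowance.

    Fills from the cheapest ask level upward, the way a market buy walks
    a limit order book. Raises ValueError if the book is too thin.
    """
    remaining = nmr_wanted
    total_cents = 0
    for level in sorted(asks, key=lambda a: a.price_cents_per_nmr):
        if remaining == 0:
            break
        take = min(remaining, level.nmr_available)
        total_cents += take * level.price_cents_per_nmr
        remaining -= take
    if remaining > 0:
        raise ValueError("not enough allowance on the ask side")
    return total_cents


# Illustrative book: 100 NMR at $0.50/NMR, then 500 NMR at $1.00/NMR.
book = [AskLevel(50, 100), AskLevel(100, 500)]
```

With this book, the first 100 NMR of allowance is filled at the cheaper level and anything beyond that at the more expensive one, so the average price rises with the amount staked.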

b. Extended features (Good to have if I have the time):

  1. Support for sale of Signals Predictions
  2. Support for sale of model file/scripts
  3. Email notifications

c. Optional features (If I have the time and ability, or something for future contributors to consider):

  1. NMR for payment and collateral [I know this is essential for anything beyond demo, and will try my best to implement it]
  2. Fraud protection enforcement using Erasure
  3. Make this a dApp

Auxiliary notes:

  1. I will try to keep the code modular and well-commented for ease of future contributions from others. I have some experience with traditional websites and web apps, but have not built a dApp. Therefore I will use a traditional backend for now. The main focus of this project is to showcase the user experience and to serve as a starting point for future contributions.
  2. I will try to maintain a good separation of front and back-end of the app to make transition to dApp easy.
  3. Dummy payment processing first, but I will try to make real NMR payments possible if I have time after completing other features. For obvious reasons I won’t use any credit card payment processing services (but if you guys want it, I can easily support Stripe).
  4. Upon delivery of the project, I will keep this hosted for up to 3 months on Google Cloud for the community to test and gather feedback.

2. Timeline
If approved, the earliest date I can start is July 15. I will commit full-time for 4-6 weeks.

3. Best case outcome
All features are implemented and functional in a timely manner. The platform is well received by the community and results in a lot of constructive feedback. An open-source team is formed to further improve the project, making it a production-grade platform that is trusted by the community and owned by the community. More users start buying and selling predictions. NMR gets another use case that benefits NMR holders and sees more adoption. I am rewarded with a nice amount of NMR too. The same platform can even be used for selling other Numerai stuff such as Signals data, Signals predictions, model files and scripts.

4. Worst case outcome
I lose the ability to work on this for unforeseeable reasons and nothing gets done. I will be paid nothing. In that case, anyone in the community is welcome to take over this proposal and build it. At worst we still have this thread as an ideation for how this kind of platform should be until someone decides to build it.

5. Success criteria
Success should be evaluated upon delivery of the open-source code and deployment of the test platform, according to how many of the features above are implemented and functioning properly.

6. Funding required, if any.
I don’t seek to make a ton of money out of this but need enough to keep me committed to the labor and to cover any cost, to be paid only after work delivery and evaluation. CoE can apply any deduction or addition at the end based on the evaluation of work quality. Fees can be broken down by features. The community can decide the weight of each feature. I suggest the following as a starting point:

a. Core Features (Total: 69.420 NMR)

  1. Price discovery (and basic buy-sell): 30 NMR
  2. Ownership verification / seller onboarding mechanisms: 10 NMR
  3. Intellectual property protection / automated submission: 15 NMR
  4. Fraud protection: 10 NMR
  5. Hosting cost (dev + 3 months): 4.420 NMR

b. Extension Features (Total: 30 NMR)

  1. Support for sale of Signals Predictions: 15 NMR
  2. Support for sale of model file/scripts: 15 NMR
  3. Email notifications: Free

c. Optional Features (Total: 60 NMR+?)

  1. NMR for payment and collateral: 30 NMR (labor+gas)
  2. Fraud protection enforcement using Erasure: 30 NMR (labor+gas)
  3. Make this a dApp: ?

Additional asks:

  1. Permission from the Numerai team to create an additional account for dev and testing
  2. One contact person from CoE for continuous feedback during the project

Some key points for discussions:

  1. Feature scope
  2. Payment and collateral rules & mechanisms
  3. Deliverables and evaluation
  4. Funding amount and structure
  5. Anything else

EDIT:
2020-06-30 - Postponed earliest start date

10 Likes

What is your sentiment about this proposal?

  • This sounds great!
  • Not too sure about this…
  • This is ridiculous! smh…
  • But what about …? [Please leave your comments below]


I would very much like a centralized marketplace for selling predictions. I think that will bring in a lot more buyers than one-off individuals trying to sell wherever they find easiest. I like the concept a lot.

One comment to start: it seems like your concept is to lock up everyone’s funds until it can be seen that the live scores match? I think there’s a lot of value in buying predictions in order to ensemble them. That’s how people buying my NFT predictions have used them in the past, so their live scores never actually match any of mine. It might just have to be “you have one week to say you didn’t get what you expected or the funds are automatically paid out” or something along those lines.
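
A rough sketch of that "one week or it pays out automatically" rule, assuming the platform records when a sale happened and when (if ever) the buyer raised a dispute. The function name and the exact window handling are illustrative assumptions, not an agreed design.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DISPUTE_WINDOW = timedelta(days=7)  # the "one week" suggested above


def payout_due(sold_at: datetime, disputed_at: Optional[datetime], now: datetime) -> bool:
    """Return True when the seller's funds should be released automatically."""
    deadline = sold_at + DISPUTE_WINDOW
    if disputed_at is not None and disputed_at <= deadline:
        # The buyer objected in time, so the payout waits for manual resolution.
        return False
    return now >= deadline


sold = datetime(2021, 7, 1, tzinfo=timezone.utc)
```

A dispute raised after the deadline is ignored here; whether late disputes should count for anything is a policy question the sketch does not try to answer.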

5 Likes

@hb_scout Thanks for the first comment! That’s a very good point. What about the following?

  • In File Mode (which would be the way of selling for the purpose of ensembling, etc.), proceeds are delivered to the seller immediately, as there is no way to verify the consistency of the buyer’s submission anyway. The exception is when the buyer opts for automated submission, in which case a consistency check is possible.
  • In Stake Mode, since the buyer won’t have access to the raw file anyway, keep the same lock and collateral mechanism as in the original proposal.
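
Expressed as a decision rule, the two bullets above could look like the sketch below. It assumes that a File Mode sale with automated submission is held until the consistency check, which is my reading of the exception rather than something spelled out; `SaleMode` and `hold_proceeds` are hypothetical names.

```python
from enum import Enum


class SaleMode(Enum):
    FILE = "file"    # buyer receives the raw prediction file
    STAKE = "stake"  # buyer only receives an allowance to stake


def hold_proceeds(mode: SaleMode, automated_submission: bool) -> bool:
    """Return True if proceeds are withheld until live scores are checked."""
    if mode is SaleMode.STAKE:
        # Stake Mode keeps the lock and collateral mechanism of the proposal.
        return True
    # File Mode pays out immediately unless the platform submits on the
    # buyer's behalf, in which case a consistency check is possible.
    return automated_submission
```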

Something like this (a market) is the best way to go – I said something along these lines in previous threads on the topic. The more the buying and selling can be a free-for-all without a bunch of commitments from either party, the better it will be. I agree with @hb_scout about the ensembling – if the buyer wants to verify that the predictions are the same ones the seller is submitting but doesn’t want to use them in that form, they can submit them unaltered to a secondary unstaked slot and see that they match. In any case, I think the buyer should actually get the predictions so they can mix & match them as they desire rather than have them somehow submitted to their slots on their behalf (as is sometimes suggested).

2 Likes

@wigglemuse Yes I like the ensembling use case too. The buyers will be able to get the predictions if the model is sold in File Mode. However, I think a seller should be able to choose to sell only in Stake Mode to allow buyers to stake without getting the actual file, if they have concerns like preventing model distillation, unauthorized resales, etc.

As for the consistency check in File Mode, buyers are free to submit the raw file in a separate slot. However, the platform would not be able to enforce anything even in the case of inconsistency, unless such a submission is made automatically by the platform. A solution could be for the buyer to provide a submission API key to the platform, which then submits the untouched raw file to one of the buyer’s designated slots for the consistency check.
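
Once the platform has submitted the same file to both slots, the check itself could be as simple as the sketch below. It assumes the live scores for a round have already been fetched into plain dictionaries (the fetching is out of scope here), and the metric names and tolerance are placeholders.

```python
import math


def live_scores_consistent(seller: dict, buyer: dict, abs_tol: float = 1e-6) -> bool:
    """Return True if both slots report the same metrics with matching values.

    Identical files should score identically, so the tolerance only
    absorbs rounding in how the scores are reported.
    """
    if seller.keys() != buyer.keys():
        return False
    return all(
        math.isclose(seller[name], buyer[name], rel_tol=0.0, abs_tol=abs_tol)
        for name in seller
    )


# Hypothetical scores for one round.
seller_scores = {"corr": 0.0213, "mmc": 0.0051}
buyer_scores = {"corr": 0.0213, "mmc": 0.0051}
```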

Some sort of reputation system of successful sales ought to keep people honest. There really is no particular incentive to screw somebody over that I can think of. And nobody is going to buy unproven predictions without a good track record on live. If we do have bad actors trying to game the system, I don’t think they’d do that by trying to give the buyer the “wrong” predictions – what is more likely is people using up their own slots on a bunch of very different high-risk high-reward models and hoping one of them rockets to the top of the leaderboard so they can sell it for a few rounds before it tanks again. Which may or may not go badly for the person buying as it may do just fine for a while. I mean that’s actually a legit strategy if you are a gambling type. The question is will anybody buy anything that isn’t significantly staked by the owner? Part of the point of all this would be to get money flowing to people that can’t afford to stake a lot themselves, but of course a bad actor using a shotgun strategy would also stake very little and claim poverty. (I think revealing the names of all the model slots owned by sellers is probably a good idea – is that actually part of the API now?) In any case, trying to abuse such a system seems like too big a hassle for people that like to abuse such systems – there are easier marks. Spending the time to make a decent model and selling it honestly is just as easy.

4 Likes

I like the proposal and would think about both buying/selling predictions. The idea of a reputation system also makes the most sense to me. With a reputation system, this all can probably be built out as a centralized front-end to OpenSea listings with additional comments/reviews functionality and Numerai specifics (showing rank, stake, history, etc.).

1 Like

Very interested in this, but I would only consider participating if there’s an option to protect the predictions file so buyers don’t get it straight up. I understand others prefer to give more allowance for what can be done with their predictions; this is just my preference.

1 Like

@wigglemuse @jrai A reputation system is a great suggestion. Would you prefer reputation based on models, on customer ratings, or both? As you can see from the static mockup, the model reputation and owner’s stake are shown prominently; hopefully these help buyers make informed decisions.

On @wigglemuse’s concern about gaming the system: I think a submission consistency check may help answer the buyer’s question “how do I know what I get is what I paid for?”, even though the likelihood of sellers purposefully selling wrong predictions is low. Sellers gambling on high-risk models is indeed a concern too; hopefully buyers base their decisions on how much the owner is staking. If a seller gambles on a high-risk model yet stakes a lot themselves, then at least they burn too when their customers burn. If they claim poverty as a reason for not staking much, I don’t currently have a good solution, but perhaps a reputation system may help.

@liz Under the current proposal, the seller can choose to sell either raw files (File Mode) or an allowance to stake (Stake Mode, where raw files are never sent to the buyers). So sellers with concerns like yours can choose to sell their models via Stake Mode.

Great proposal @restrading! Thanks for the time and attention you’ve invested in thinking about, drafting and editing it.

I have a couple of follow up questions:

  1. Could you elaborate on the technical aspects of your proposal?
    • What tech stack?
    • Testing plan.
    • Hosting plan after the initial 3 months. Could it perhaps fund itself with a tiny platform fee?
  2. At first glance, I’m a bit concerned about the fraud protection feature. It deserves a lot of careful thought. From my understanding of what you describe, a buyer could grief the seller by submitting different (possibly better) predictions and then crying wolf.

Finally, I think payments (preferably in NMR) should be a core feature. The project wouldn’t be of much use to the community in practice, otherwise.

2 Likes

@jrb Thanks for the tech questions. These are my current thoughts:

  1. Could you elaborate on the technical aspects of your proposal?
  • What tech stack?
    React front-end, Flask (centralized) back-end. I’m not trying to make this a dApp from the get-go. Arguably the front-end UX is the main focus. Test deployment will be on GCP. I have not decided on the database. I’m open to recommendations in terms of the tech stack. I’m a fast learner and like to try out new frameworks.
  • Testing plan.
    I will try to write automated unit tests for the key features of the back-end API and integration tests for key use cases. I’ll research best practices further. For the front-end I would manually test key use cases. I don’t have a lot of experience with formal tests, so any advice regarding testing is appreciated.
  • Hosting plan after the initial 3 months. Could it perhaps fund itself with a tiny platform fee?
    If I manage to implement the payment subsystem then the hosting could be self-sustainable. However currently this is not meant to be production ready but rather a starting point.
  2. At first glance, I’m a bit concerned about the fraud protection feature. It deserves a lot of careful thought. From my understanding of what you describe, a buyer could grief the seller by submitting different (possibly better) predictions and then crying wolf.
  • Under the current proposal, in situations where fraud protection applies, the platform, not the buyer, automatically submits the file to check for consistency. No buyer action can cause a penalty on the seller and vice versa.

Indeed. I should’ve elaborated on the attack vector I had in mind in my earlier post. The specific scenario I had in mind was as follows:

  1. Seller offers predictions on the platform
  2. Buyer buys the predictions
  3. The platform submits the predictions to both the seller’s and the buyer’s Numerai accounts via the API.
  4. The buyer later goes on to submit different predictions (manually, or via the API).
  5. The resulting live scores are different and the seller is griefed through no fault of their own.

The seller could also do the same, but I can’t see an incentive in the proposed scheme for the seller to exploit it.

2 Likes

@jrb Got it. Numerai’s API provides information on the val stats and datetime of submission, so it should be easy enough to check whether the buyer subsequently overrode the automatic submission. If overridden, no fraud protection would be in place. Did I miss anything?
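
A minimal sketch of that timestamp check, assuming the platform stores the time of its own upload and can read back the time of the buyer slot's latest submission. How that value is read from Numerai's API is left out, and the tolerance for clock differences is an arbitrary assumption.

```python
from datetime import datetime, timedelta, timezone


def submission_untouched(
    platform_submitted_at: datetime,
    latest_reported_at: datetime,
    tolerance: timedelta = timedelta(seconds=5),
) -> bool:
    """Return True if the slot's latest submission is still the platform's upload.

    A later timestamp means the buyer resubmitted, so fraud protection
    would no longer apply to this sale.
    """
    return abs(latest_reported_at - platform_submitted_at) <= tolerance


uploaded = datetime(2021, 7, 17, 18, 0, tzinfo=timezone.utc)
```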

Val stats can be doctored, but the timestamp of the submission should work. Although, there might be a timing attack lurking.

1 Like

General idea for submissions: have Numerai compute a checksum value for each submission that could be checked by anyone via API so the entire submission could be validated as same or not same as something else. (This would be useful in scenarios outside of this proposal but also for this.) Applying it to ONLY the live era might be useful also.
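
As an illustration of what such a checksum could be, the sketch below hashes a submission in a canonical form so that row order and number formatting do not change the result. The `id` and `prediction` column names and the 6-decimal rounding are assumptions; Numerai does not currently publish a checksum like this, so this is only what a shared method might look like.

```python
import csv
import hashlib
import io


def submission_checksum(csv_text: str) -> str:
    """Return a SHA-256 hex digest of a submission in canonical form."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda r: r["id"])  # make the digest independent of row order
    digest = hashlib.sha256()
    for row in rows:
        # Fixed precision so "0.5" and "0.50" hash identically.
        line = f'{row["id"]},{float(row["prediction"]):.6f}\n'
        digest.update(line.encode("utf-8"))
    return digest.hexdigest()


original = "id,prediction\nn001,0.5\nn002,0.25\n"
reordered = "id,prediction\nn002,0.250\nn001,0.50\n"
altered = "id,prediction\nn001,0.5\nn002,0.26\n"
```

Restricting the input to live-era rows before hashing would give the live-only variant mentioned above.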

2 Likes

The concept of selling predictions is interesting.

What are the benefits of selling predictions? How will it affect participants that don’t sell? How will it help the metamodel?

Does it re-enable the possibility of scoreboard attacks?

I assume the scoreboard is the place prospective buyers are to look for information. Unlike in earlier iterations, currently the scoreboard is unimportant. With selling, will there be incentive to game the scoreboard in order to profit from selling predictions?

A problem is that the buyer is basing their decision on past performance. The seller knows how the model works but does not have the confidence to risk further staking themselves.

1 Like

Or maybe just not the funds. I’d stake a lot more on my models if I had a lot more to stake – I just don’t.

1 Like

@jrb A timing attack is possible from the moment the platform starts submitting a particular file until the moment it checks the submission status, which would be less than a minute. Since it seems the Numerai API also provides the filename, the platform can add a random string to the filename. I think this would patch things up nicely?
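
The random-string idea could be sketched as follows, assuming the filename reported back by the API is exactly the one the platform uploaded. `tag_filename` and `is_platform_upload` are hypothetical helper names.

```python
from __future__ import annotations

import secrets
from pathlib import PurePosixPath


def tag_filename(filename: str, nbytes: int = 8) -> tuple[str, str]:
    """Return (tagged_filename, token) with an unguessable token in the name."""
    token = secrets.token_hex(nbytes)  # 2 * nbytes hex characters
    path = PurePosixPath(filename)
    return f"{path.stem}_{token}{path.suffix}", token


def is_platform_upload(reported_filename: str, token: str) -> bool:
    """Return True if the reported filename carries the platform's token."""
    return token in reported_filename


tagged, token = tag_filename("predictions.csv")
```

Because the buyer never sees the token, a file they upload themselves inside the timing window would not carry it.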

@wigglemuse A checksum computed by Numerai may not be sufficient to mitigate this edge case of a timing attack, unless the platform itself can also produce the checksum with the same method as Numerai.

@mic I don’t think this is comparable to a leaderboard attack as there is no guarantee of payout simply due to high position on the leaderboard, nor is the payment coming from Numerai. This is a free market solution. There will be people who gamble with high risk models, but if they are not staking much themselves the buyers are going to see that. Reasonable buyers are going to take owner’s stake amount into account. If they do stake a lot themselves, then there is no issue because they are confident in their models.

Well, that would be fine. Or just have Numerai keep a history of submissions (timestamp & checksum) viewable by API so it can be seen when submissions are replaced.
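
If such a history existed, detecting a replaced submission could look like the sketch below. The record shape (timestamp plus checksum) follows the suggestion above, but the API providing it is hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SubmissionRecord:
    submitted_at: datetime
    checksum: str


def was_replaced(history: list[SubmissionRecord], expected_checksum: str) -> bool:
    """Return True if the most recent submission is not the expected file."""
    if not history:
        raise ValueError("no submissions recorded for this slot")
    latest = max(history, key=lambda record: record.submitted_at)
    return latest.checksum != expected_checksum


# Hypothetical history: the platform's upload followed by a buyer resubmission.
history = [
    SubmissionRecord(datetime(2021, 7, 17, 18, 0, tzinfo=timezone.utc), "abc123"),
    SubmissionRecord(datetime(2021, 7, 18, 9, 30, tzinfo=timezone.utc), "def456"),
]
```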

1 Like