by robo_boi
Removing Dangerous Features
On Aug 3rd, an announcement in the Numerai Rocket.Chat linked to a forum post written by master_key (MikeP). The post states that a handful of features in the v3 and v4 datasets have changed in their construction and no longer behave the way they did in the training data. The Numerai team strongly recommends removing these features from your models.
There are 5 features which are particularly bad offenders:
['feature_unsustaining_chewier_adnoun', 'feature_coastal_edible_whang', 'feature_trisomic_hagiographic_fragrance', 'feature_censorial_leachier_rickshaw', 'feature_steric_coxcombic_relinquishment']
If you look at these features’ correlations with the target over time, you will see that they are very consistently negatively correlated for most of the data, but in more recent times have almost 0 correlation with the target.
Here is a cumulative sum of these features' correlations with the target, which makes the shift easy to see.
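A curve like this can be approximated with a few lines of pandas. The sketch below is only an illustration, not the code behind the forum post: the helper name `cumulative_era_corr` is hypothetical, the `era` and `target` column names are assumed, plain Pearson correlation is assumed as the metric, and the data is synthetic (a made-up `feature_demo` that is negatively correlated with the target for 30 eras and pure noise afterwards).

```python
import numpy as np
import pandas as pd


def cumulative_era_corr(df, features, target_col="target", era_col="era"):
    """Cumulative sum over eras of each feature's per-era correlation with the target."""
    per_era = {}
    for era, era_df in df.groupby(era_col, sort=True):
        # Pearson correlation of every listed feature with the target in this era.
        per_era[era] = era_df[features].corrwith(era_df[target_col])
    # Rows are eras (in sorted order), columns are features.
    return pd.DataFrame(per_era).T.cumsum()


# Synthetic stand-in data (NOT Numerai data): the feature tracks -target for
# the first 30 eras, then becomes unrelated noise for the last 10.
rng = np.random.default_rng(0)
frames = []
for era in range(1, 41):
    target = rng.normal(size=200)
    noise = rng.normal(size=200)
    feature = -target + noise if era <= 30 else noise
    frames.append(pd.DataFrame({"era": era, "feature_demo": feature, "target": target}))
demo = pd.concat(frames, ignore_index=True)

cum = cumulative_era_corr(demo, ["feature_demo"])
print(cum.iloc[[0, 29, 39]])
```

On the synthetic data the cumulative curve falls steadily and then flattens once the relationship disappears, which is the same shape described above. Calling `cum.plot()` (with matplotlib installed) would draw it.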
It is stated in the forum post:
Typically, we try to stick to a philosophy of giving the users all of the features, even if we don’t think it’s necessarily good to use them, with the hope that the users will be better at deciding which features to use, and how to use them, than we are.
However in this case, we find that the features’ behavior differences are so problematic to the point that no model should be using these features in any way.
In one simple test, we find that removing these features entirely can boost performance from a correlation of about 0.023 up to 0.025 over the validation eras, and a similar level of performance boost will continue into the future.
These 5 above are the worst offenders, but we also suggest removing 5 more features as well due to a similar reason. Below are the complete lists of features to remove from your models in the v4 and v3 datasets.
v4:
['feature_palpebral_univalve_pennoncel',
'feature_unsustaining_chewier_adnoun',
'feature_brainish_nonabsorbent_assurance',
'feature_coastal_edible_whang',
'feature_disprovable_topmost_burrower',
'feature_trisomic_hagiographic_fragrance',
'feature_queenliest_childing_ritual',
'feature_censorial_leachier_rickshaw',
'feature_daylong_ecumenic_lucina',
'feature_steric_coxcombic_relinquishment']
v3:
['feature_base_ingrain_calligrapher',
'feature_unvaried_social_bangkok',
'feature_deliberative_connatural_kinetoscope',
'feature_haziest_lifelike_horseback',
'feature_accusatory_disinfectant_deportment',
'feature_exorbitant_myeloid_crinkle',
'feature_jerkwater_eustatic_electrocardiograph',
'feature_undivorced_unsatisfying_praetorium',
'feature_direst_interrupted_paloma',
'feature_lofty_acceptable_challenge']
These features are not present in v2 data.
Numerai stated that they will not change the construction of the v3 and v4 datasets, because they are wary of disrupting user pipelines. Future data releases will not include any of these features.
See the full forum post here
jrb posted a Python function that replaces the dangerous features in a pandas DataFrame with NaN, or anything else you like. It works with both v3 and v4 data.
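jrb's function is not reproduced here. As a rough sketch of the same idea, assuming your data is already loaded into a pandas DataFrame, something like the following would work. The names `DANGEROUS_V4` and `mask_features` are hypothetical and chosen for illustration; the feature names are the v4 list above, and the v3 list can be passed in the same way.

```python
import numpy as np
import pandas as pd

# The v4 features listed above; pass the v3 list instead for v3 data.
DANGEROUS_V4 = [
    "feature_palpebral_univalve_pennoncel",
    "feature_unsustaining_chewier_adnoun",
    "feature_brainish_nonabsorbent_assurance",
    "feature_coastal_edible_whang",
    "feature_disprovable_topmost_burrower",
    "feature_trisomic_hagiographic_fragrance",
    "feature_queenliest_childing_ritual",
    "feature_censorial_leachier_rickshaw",
    "feature_daylong_ecumenic_lucina",
    "feature_steric_coxcombic_relinquishment",
]


def mask_features(df, features=None, fill_value=np.nan):
    """Return a copy of df with the given feature columns overwritten by fill_value."""
    if features is None:
        features = DANGEROUS_V4
    out = df.copy()
    for col in features:
        # Skip names that are not in this DataFrame, so the same call is safe
        # on frames that only contain a subset of the features.
        if col in out.columns:
            out[col] = fill_value
    return out


# Tiny made-up frame: one dangerous column and one hypothetical safe column.
demo = pd.DataFrame({
    "feature_coastal_edible_whang": [0, 1, 2],
    "feature_safe_demo": [3, 4, 0],
})
masked = mask_features(demo)
print(masked)
```

Note that filling with NaN turns an integer column into floats. If your model cannot handle NaN, pass a constant instead (for example `fill_value=0`), or simply leave these columns out of your feature list.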
RNumerai Rewrite Is Complete
On Aug 23rd, omnianalytics announced that the RNumerai rewrite is complete. The package lets you download tournament data, submit predictions, get user information, stake NMR and much more. Using its functions, end users can write R code to automate the whole Numerai tournament workflow.
https://github.com/Omni-Analytics-Group/Rnumerai
NMR
NMR is down around 24% over the past month. BTC is down around 12% over the same period. U.S. Federal Reserve Chair Jerome Powell doubled down this month on staying the course with future interest rate hikes. The market is currently pricing in a 39.5% chance of a 50 basis point (bps) rate hike and a 60.5% chance of a 75 bps rate hike in September, according to CME Group’s FedWatch tool.
#decorrelateNMR
Memes of The Month
by keno
by jrb
Next CoE Sponsored Meetup Location- NYC, USA
In late July, Richard Craib posted on Twitter that he had decided to move to New York to deepen Numerai’s relationship with global capital. After deliberating over the past month, the CoE decided to hold the next CoE sponsored meetup/hackathon in NYC! We have our eyes set on Saturday, Sep 24th, but are still working out the final details. We would push into October if we could, but Richard would not be available to attend that month, and we feel having him present would be amazing!
With the date approaching in less than one month, please pay close attention to the Rocket.Chat and the CoE twitter account for registration details.
NumerBay - The Community Marketplace Updates
2022-07-31 — 2022-08-14:
Minor UI improvements
2022-08-14 — 2022-08-28:
Added refund button to the purchases page to allow sending email messages to sellers
Added CORR60 metric
Fixed CD pipeline
Other bug fixes and improvements (See list of closed issues)
Compute Lite Beta Testing is Live!
The compute lite beta has been in progress for the past few weeks and the Numerai team is ready for more users to try it out.
If you'd like to join the beta, please go through this document: https://docs.google.com/document/d/1RCKgL4SAqEJ2atnMsdaPHdlV-d7pxJl9dB__mSx11CM/edit?usp=sharing
2chanes also attended the latest CoE weekly Twitter Space to explain Compute Lite and answer questions from the Numerai community.
Please sign up!
CoE Wallet Transactions
Aug 3rd -46 NMR Numerbay
Aug 3rd -31 NMR Newsletter
Aug 14th -12 NMR Numerbay
Aug 30th -39 NMR Numerbay
Other News
The videos and slides from the London meetup are now available --> https://github.com/councilofelders/meetups
London Meetup NFTee Claim is live: For the 14 people who obtained a POAP at the London CoE meetup and hackathon event last month, your free NFTee is now ready to be claimed! Please head over to https://numer.ai/nftee/coe to claim. Remember, you will need a very small amount of ETH to pay gas for the claim transaction. Hopefully we see some new rare ones! 🔥 If you are unable to claim, most likely you did not complete the process of claiming the POAP to a non-custodial wallet after entering an email.
Corr60 scores have recently been added to the leaderboards and model pages. We are still waiting on official word on whether 60-day rounds are on the horizon, as well as when a “daily” tournament will start.
Richard will be speaking on a panel at Future Of Finance Conference on Sept 16th in NYC
Disclaimer: This is not an official Numerai newsletter. It is sponsored by the Numerai CoE, a decentralized autonomous organization. Every effort is made to provide accurate and complete information, but there are no claims, promises or guarantees about the accuracy of the contents.