Super-Martingale
Super-Martingale OP t1_ishqkx4 wrote
Reply to comment by Null-value0 in [D] Suggestions for large-scale company name standardization? by Super-Martingale
Yes, most companies in our list are in the US, but the majority of them are privately held. Do tools like Informatica / TIBCO charge tons of money?
Super-Martingale OP t1_isgey5g wrote
Reply to comment by hjmb in [D] Suggestions for large-scale company name standardization? by Super-Martingale
There is definitely a tradeoff between accuracy and efficiency. We are not sure which approach would be better, so we want to keep the discussion broad.
Super-Martingale OP t1_isgacv9 wrote
Reply to comment by hjmb in [D] Suggestions for large-scale company name standardization? by Super-Martingale
In the past, I used fuzzy matching plus manual selection for smaller lists of a few thousand strings. For millions of rows, that is just not feasible, so we are wondering whether AI-based approaches can help.
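For reference, here is a minimal sketch of the kind of fuzzy matching described above, with one change that is an assumption on my part rather than what we actually ran: names are first grouped into blocks by their first normalized token, so comparisons happen only within a block instead of across every pair. The suffix list, the `normalize` and `cluster_names` helpers, and the 0.85 threshold are all hypothetical placeholders, and it uses only the standard library `difflib`.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical list of legal suffixes to strip; real data would need a longer one.
LEGAL_SUFFIXES = {"inc", "incorporated", "corp", "corporation",
                  "co", "company", "llc", "ltd", "lp"}


def normalize(name):
    """Lowercase, replace punctuation with spaces, drop trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while len(tokens) > 1 and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def cluster_names(names, threshold=0.85):
    """Group similar names, comparing only names that share a first token."""
    # Blocking step: bucket names by the first token of the normalized form.
    blocks = defaultdict(list)
    for name in names:
        norm = normalize(name)
        key = norm.split()[0] if norm else ""
        blocks[key].append((name, norm))

    clusters = []
    for members in blocks.values():
        # Each entry is (representative normalized name, original names).
        block_clusters = []
        for name, norm in members:
            for rep, group in block_clusters:
                if SequenceMatcher(None, rep, norm).ratio() >= threshold:
                    group.append(name)
                    break
            else:
                # No existing cluster is close enough, so start a new one.
                block_clusters.append((norm, [name]))
        clusters.extend(group for _, group in block_clusters)
    return clusters


if __name__ == "__main__":
    sample = ["Acme Corp.", "ACME Corporation", "Acme, Inc.", "Globex LLC", "Globex"]
    for group in cluster_names(sample):
        print(group)
```

Blocking on the first token keeps the number of comparisons manageable, but it will miss pairs where the first word itself is misspelled or abbreviated differently, so it trades some accuracy for speed.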
Super-Martingale OP t1_isk4gcc wrote
Reply to comment by CremeEmotional6561 in [D] Suggestions for large-scale company name standardization? by Super-Martingale
Where can we get "alias names" for the universe of US companies?