| | Address matching using a fault tolerant trie (robinlinacre.com) |
| 3 points by RobinL 15 days ago | past |
|
| | An Interactive Introduction to Probabilistic Data Linkage/Deduplication (robinlinacre.com) |
| 1 point by RobinL 3 months ago | past |
|
| | Building accurate postal address matching systems (robinlinacre.com) |
| 4 points by fanf2 3 months ago | past |
|
| | Building Accurate Address Matching Systems (robinlinacre.com) |
| 13 points by Bogdanp 3 months ago | past | 2 comments |
|
| | Building Accurate Address Matching Systems (robinlinacre.com) |
| 2 points by RobinL 3 months ago | past |
|
| | Putting Scaffolding Around Vibe Coding to Build More Complex Apps (robinlinacre.com) |
| 3 points by RobinL 4 months ago | past |
|
| | Why DuckDB is my first choice for data processing (robinlinacre.com) |
| 3 points by RobinL 6 months ago | past |
|
| | AI probably won't replace me in 2025 (robinlinacre.com) |
| 3 points by RobinL 9 months ago | past | 2 comments |
|
| | The emerging impact of LLMs on my productivity (robinlinacre.com) |
| 3 points by sebg 10 months ago | past |
|
| | Super-fast deduplication of large datasets using Splink and DuckDB (robinlinacre.com) |
| 4 points by chuckhend on Jan 20, 2024 | past |
|
| | Super-fast deduplication of large datasets using Splink and DuckDB (robinlinacre.com) |
| 5 points by RobinL on Jan 19, 2024 | past |
|
| | Why Probabilistic Linkage Is More Accurate Than Fuzzy Matching for Data Deduping (robinlinacre.com) |
| 4 points by RobinL on Jan 7, 2024 | past |
|
| | An Interactive Introduction to Probabilistic Data Linkage/Deduplication (robinlinacre.com) |
| 3 points by RobinL on Oct 29, 2023 | past |
|
| | Why parquet files are my preferred API for bulk open data (robinlinacre.com) |
| 3 points by RobinL on Sept 15, 2023 | past |
|
| | SQL should be the default choice for data transformation logic (robinlinacre.com) |
| 434 points by RobinL on Jan 30, 2023 | past | 267 comments |
|
| | Why parquet files are my preferred API for bulk open data (robinlinacre.com) |
| 2 points by kristianp on Jan 10, 2023 | past |
|
| | Demystifying Apache Arrow (2020) (robinlinacre.com) |
| 197 points by dmlorenzetti on Jan 9, 2023 | past | 47 comments |
|
| | Why parquet files are my preferred API for bulk open data (robinlinacre.com) |
| 8 points by RobinL on Jan 9, 2023 | past |
|
| | Demystifying Apache Arrow (robinlinacre.com) |
| 1 point by dmlorenzetti on Oct 27, 2021 | past |
|
| | The downfall of command and control data leadership (robinlinacre.com) |
| 2 points by RobinL on Nov 8, 2020 | past |
|