-  📖 Implementing academ.ai, a local retrieval system for academic papers: semantic and hybrid retrieval for academic papers - search across >100,000 papers and find the relevant ones without losing your mind over keyword-based search. Curious? Read more
-  📝 Auto-METRICS - a proof of concept: automatic assessment of methodological quality in radiomics research. Recently, I received my first (obvious) LLM peer review. It was quite blatant. What’s worse: it wasn’t good - at all! Funnily enough, I had been working on something related: Auto-METRICS, a tool for automatic, standardised assessment of research quality in radiomics using the METRICS framework. To show its utility, we made use of two unique, recent datasets on reproducibility in radiomics studies - Akinci D’Antonoli et al. (2025) and Kocak (2025). Together, they feature a really good set of METRICS ratings - for different levels of expertise and training - for more than 50 publications. This allowed us to systematically compare human and LLM raters. The main takeaways:
- Human raters agree with LLMs at the same rate that they agree with other human raters ✅
- Prompt iterations: clarifying radiomics guidelines in the prompt can lead to better agreement with human raters. However, these improvements were quite limited! 📈
- Too nice: LLM ratings tended to be slightly higher than those offered by human raters 😇
I tested our tool - Auto-METRICS - here (all you need is a free Google Gemini API key) and found it really helpful for getting an initial METRICS assessment that I can then easily confirm. The key? Enhance, don’t replace - having good initial ratings was super helpful in reaching a final, human-based classification. Curious? Read more about Auto-METRICS at medRxiv.
-  📖 Tool for analysing polls. I worked on this with a very practical aim - I was fed up with the very poor analyses of polls appearing in the national press. Curious? Read more