André Lizardo
I built a native macOS notes app for search-first, plain-text people
For anyone hunting an nvALT / Notational Velocity–style workflow—without Electron, and with one IDE trick I missed.
Mar 25
October 2025
Notes on "Evaluating Large Language Models Trained on Code"
Codex is the model behind GitHub Copilot. My notes focus not on Codex but on HumanEval, a functional correctness dataset designed as the primary…
Oct 12, 2025
September 2025
Notes on "SWE-bench: Can Language Models Resolve Real-World GitHub Issues?"
SWE-bench is a benchmark that evaluates how well large language models resolve real-world GitHub issues drawn from Python repositories.
Sep 21, 2025
Notes on "COFFE: A Code Efficiency Benchmark for Code Generation"
COFFE is a code efficiency benchmark that evaluates both the correctness and the time efficiency of generated code, measuring efficiency with CPU instruction counts.
Sep 14, 2025
Notes on "The Illusion of Thinking"
The Apple Machine Learning Research team released a paper in June 2025 that questions whether large reasoning models (LRMs) actually think and whether such models can reason when the…
Sep 2, 2025