Since the launch of ChatGPT in late 2022 and the popularization of Artificial Intelligence (AI) that followed, the scientific community has rapidly integrated Large Language Models (LLMs) into its work. These tools now assist with daily administrative tasks, polish the language of emails and documents, help with coding, and more.
Every time we use this technology, we must remember that with great power comes great responsibility: how we use and deploy LLMs will define the integrity of our scientific research in the years to come. Therefore, today I want to share a set of guiding principles for fair use. My core conviction is simple: LLM technology should enhance our work, never produce it. The human mind should remain the source of discovery; AI is our assistant, helping us manage and sort huge amounts of information.
Below, I describe some rules we can adopt to ensure our use of LLMs respects privacy, maintains integrity, and keeps humans firmly in the driver's seat.
1. Keep a Human in the Loop

In every task involving AI, the most critical rule is that the human must remain the primary author. For example, when drafting a scientific manuscript, AI can help structure the document, but we should never allow it to write the draft for us. Over the last three years of exploring this technology, I have always followed these rules:
- Draft your own manuscript first: Before introducing an AI tool, write your own draft. This ensures that your voice, perspective, and unique contributions are embedded in the text from the start. In the long run, using an LLM this way will enhance your scientific writing skills rather than eroding them.
- No autogeneration of content or images: Never allow an LLM to generate the core content of your manuscript (specifically results and discussion) or to produce scientific images. These elements must always originate from your own data analysis and observations. If you let AI generate these sections, you are outsourcing the discovery process, which violates the generative AI policies of most journals.
- Use AI for language enhancement only: Let AI assist with grammar, flow, clarity, and citation formatting. Using AI this way is like having a spellchecker on steroids. It polishes your existing work without creating it for you.
- Verify, don't copy and paste: Never accept an LLM's output blindly. Always verify facts and citations; if you copy and paste without verification, you risk propagating hallucinations and plagiarism (see the citation-check sketch below). Read every sentence aloud; if it doesn't sound like you, rewrite it until it does.
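For citations in particular, a quick automated sanity check can catch hallucinated references before your human review. Below is a minimal sketch (my own illustrative script, not a standard tool) that asks the public Crossref REST API whether a DOI resolves to a real record; the DOI shown is just a placeholder example.

```python
# check_doi.py - minimal sketch: flag AI-suggested DOIs that do not resolve.
# Queries the public Crossref REST API (no API key required).
import requests

def verify_doi(doi: str) -> None:
    """Look up a DOI on Crossref and print its registered title, or flag a failure."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    if resp.status_code == 200:
        title = resp.json()["message"].get("title", ["<no title>"])[0]
        print(f"OK   {doi}: {title}")
    else:
        # A 404 here is a strong hint that the citation was hallucinated.
        print(f"FAIL {doi}: HTTP {resp.status_code} - check by hand")

# Placeholder example; replace with the DOIs from your own draft.
verify_doi("10.1038/s41586-021-03819-2")
```

Note that a resolving DOI only proves the record exists; you still need to read the paper and confirm it supports the claim you cite it for.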
2. Privacy First: Protect Your Intellectual Property

Unpublished scientific data is sensitive because it is intellectual property. Before feeding your manuscript into an AI tool, ask yourself: where does this data go, and who owns it?
To protect your work, you can adopt the following measures:
- Avoid free web-based LLMs: Do not use free, public-facing chatbots like ChatGPT for sensitive research data. These services often have terms of service that allow your inputs to be used for training. Remember: if it's free, you are the product. Feeding proprietary research into a model that may later be trained on it risks leaking your own future discoveries.
- Prevent model re-training: On every AI platform, configure the settings so your data is not used by the provider to train future models. Providers may claim privacy is maintained by default, but you should explicitly opt out of training in any account you use, personal or institutional.
- Prioritize local LLMs: Whenever possible, run models locally on your own computer (e.g., via Ollama or LM Studio). This ensures that your data never leaves your secure environment and gives you full control over privacy. As of 2026, open-weight models are highly capable; the main constraint is hardware, with upward of $1,000 needed to run small and medium-sized models comfortably. A minimal example follows below.
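To show how little code local inference requires, here is a minimal sketch using the Ollama Python client for language polishing. It assumes the Ollama server is running and a model has been pulled (e.g., `ollama pull llama3.2`); the model name and draft sentence are placeholders. Nothing in this exchange leaves your machine.

```python
# local_polish.py - minimal sketch: grammar polishing with a local model via Ollama.
# Assumes the Ollama server is running and the model has been pulled locally.
import ollama  # pip install ollama

draft = "Our experiments shows that the proposed method perform better."

response = ollama.chat(
    model="llama3.2",  # placeholder; use whatever model you have pulled
    messages=[
        {"role": "system",
         "content": "Fix grammar and improve clarity. Do not add new content."},
        {"role": "user", "content": draft},
    ],
)
# Review the suggestion before accepting it; you remain the author.
print(response["message"]["content"])
```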
3. Maintain the Logs

In this new era of AI-generated content, transparency is our greatest asset. When working with AI in scientific research, we must document not just our scientific methods, but also how we were assisted.
By documenting your workflow, you prove that the AI was used to enhance your work rather than produce it:
- Document usage: Keep a logbook of every prompt used, every suggestion accepted, and every revision made by the AI. The documentation can be structured into three stages: 1) your draft, 2) the AI-enhanced version, 3) your review (accepting or declining changes). A minimal logging sketch follows this list.
- Logs for future revisions: Keeping these logs ensures your drafting process remains reproducible. If the process is ever questioned by reviewers or editors, you can demonstrate exactly how the AI tool was used.
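As a concrete example, here is a minimal sketch of such a logbook: one JSON record appended per AI interaction, covering the three stages above. The file name and field layout are my own convention, not any standard.

```python
# ai_logbook.py - minimal sketch: append-only log of AI assistance.
# One JSON record per interaction; file name and fields are an illustrative convention.
import json
from datetime import datetime, timezone

def log_ai_edit(prompt: str, my_draft: str, ai_version: str, decision: str) -> None:
    """Record one AI interaction: your draft, the AI suggestion, and your review."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "my_draft": my_draft,        # stage 1: your own text
        "ai_version": ai_version,    # stage 2: what the model suggested
        "decision": decision,        # stage 3: "accepted", "declined", or "edited"
    }
    with open("ai_logbook.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

log_ai_edit(
    prompt="Fix grammar, keep my wording where possible.",
    my_draft="Our experiments shows that...",
    ai_version="Our experiments show that...",
    decision="accepted",
)
```

A plain JSONL file like this is easy to diff, to version-control alongside your manuscript, and to share with editors if your process is ever questioned.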
Conclusion
Adopting these practices does not mean rejecting new AI technology; it means using it wisely. By keeping humans in the loop, protecting data privacy, defining fair-use boundaries, and logging our actions, we ensure that AI assists us in the scientific discoveries yet to come.
What are your thoughts on these guidelines? How do you currently integrate LLMs into your workflow? Let me know in the comments below!
Further Reading: Official AI Ethics & Journal Policies
You can find detailed frameworks for responsible AI usage in these key resources:
- Elsevier: Generative AI Policies for Journals
- Springer Nature: AI & Machine Learning Journal Policies
- The Nature Portfolio: AI Editorial Policies