Bibliography Management Guide
Reference management with a multi-file bibliography system, deduplication, and hash-based caching.
Table of Contents
- Overview
- Quick Start
- Single File Workflow
- Multi-File Workflow
- Deduplication
- Hash-Based Caching
- Manual Merging
- Demo Files
- Citation Styles
- Troubleshooting
- Best Practices
- Summary
Overview
Bibliography management system supporting:
- Single-file workflow: Single .bib file
- Multi-file workflow: References organized by topic
- Deduplication: Duplicate detection by DOI or title+year
- Hash-based caching: Skips merge when files unchanged
- Automatic merge: Runs during compilation
- Citation style configuration: Via YAML config files
Location: 00_shared/bib_files/
Quick Start
Option 1: Use Demo Files
# Compile manuscript (auto-merges 3 demo files)
./scripts/shell/compile_manuscript.sh
# Expected output:
# [00:00:00] Starting: Bibliography Merge
# Merging 3 bibliography files...
# - field_background.bib
# - methods_refs.bib
# - my_papers.bib
# ✓ Merged bibliography saved: 00_shared/bib_files/bibliography.bib
# Input entries: 13
# Unique entries: 12
# Duplicates removed: 1
Option 2: Single File
# Delete demo files
rm 00_shared/bib_files/{methods_refs,field_background,my_papers}.bib
# Edit single file
vim 00_shared/bib_files/bibliography.bib
# Compile normally
./scripts/shell/compile_manuscript.sh
Option 3: Custom Multi-File Organization
cd 00_shared/bib_files/
# Create topic-based files
vim deep_learning.bib
vim neuroimaging.bib
vim my_publications.bib
# Compile (auto-merges)
cd ../..
./scripts/shell/compile_manuscript.sh
Single File Workflow
When to use:
- Small projects (<50 references)
- Simple organization needs
- Quick papers or letters
Setup:
# Delete the topic files, keeping only bibliography.bib
find 00_shared/bib_files -maxdepth 1 -name '*.bib' ! -name 'bibliography.bib' -delete
# Edit directly
vim 00_shared/bib_files/bibliography.bib
Structure:
00_shared/bib_files/
└── bibliography.bib # All references here
No merging needed - compilation uses bibliography.bib directly.
Multi-File Workflow
When to use:
- Large projects (50+ references)
- Collaborative writing
- Topic-based organization
- Reusing reference sets across projects
Recommended Organization:
00_shared/bib_files/
├── bibliography.bib # Auto-generated (DO NOT EDIT)
├── .bibliography_cache.json # Cache (auto-managed)
├── methods_refs.bib # Methods and techniques
├── field_background.bib # Field overview papers
├── my_papers.bib # Your publications
├── deep_learning.bib # Deep learning references
└── neuroimaging.bib # Neuroimaging studies
Workflow:
1. Add references to the appropriate topic file:
vim 00_shared/bib_files/methods_refs.bib
2. Compile (automatic merge):
./scripts/shell/compile_manuscript.sh
3. Check merged output:
cat 00_shared/bib_files/bibliography.bib
Important:
- DO NOT manually edit bibliography.bib (it's auto-generated)
- Edit only the source .bib files
- bibliography.bib is regenerated on each compilation
Deduplication
The merge system removes duplicate entries using a two-tier strategy:
Deduplication Strategy
1. By DOI (most reliable):
@article{Smith2020_A,
doi = {10.1038/s41598-020-12345-6},
author = {Smith, J. and Doe, J.},
title = {Neural Processing},
year = {2020}
}
# Same DOI as Smith2020_A, different title
@article{Smith2020_B,
doi = {10.1038/s41598-020-12345-6},
author = {Smith, John and Doe, Jane},
title = {Neural Processing Methods},
year = {2020}
}
# → Merged into a single entry with combined metadata
2. By Title + Year (fallback):
@article{Chen2021_A,
author = {Chen, W.},
title = {Machine Learning for Neural Data},
year = {2021}
}
# Same normalized title and same year as Chen2021_A
@article{Chen2021_B,
author = {Chen, Wei and Zhang, L.},
title = {Machine Learning for Neural Data},
year = {2021}
}
# → Merged (no DOI available, using title+year)
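The two tiers above can be sketched as a function that computes a deduplication key per entry. This is a minimal illustration, not the code in merge_bibliographies.py: the function names `normalize_title` and `dedup_key` are hypothetical, the exact normalization rules are assumed, and entries are assumed to be plain dicts with the citation key stored under `ID`.

```python
import re
import unicodedata


def normalize_title(title):
    """Lowercase, drop braces and punctuation, collapse whitespace (assumed rules)."""
    text = unicodedata.normalize("NFKD", title)
    text = re.sub(r"[{}]", "", text)
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def dedup_key(entry):
    """Tier 1: DOI. Tier 2: normalized title + year. Otherwise the citation key."""
    doi = entry.get("doi", "").strip().lower()
    if doi:
        # Treat a DOI URL and a bare DOI as the same identifier.
        doi = re.sub(r"^https?://(dx\.)?doi\.org/", "", doi)
        return ("doi", doi)
    title = normalize_title(entry.get("title", ""))
    year = entry.get("year", "").strip()
    if title and year:
        return ("title_year", title, year)
    # Nothing to match on: the entry can only duplicate itself.
    return ("key", entry.get("ID", ""))


chen_a = {"ID": "Chen2021_A", "title": "Machine Learning for Neural Data", "year": "2021"}
chen_b = {"ID": "Chen2021_B", "title": "{Machine} learning for neural data.", "year": "2021"}
print(dedup_key(chen_a) == dedup_key(chen_b))  # True
```

Entries that share a key are treated as duplicates. In this sketch an entry with a DOI never matches an entry without one, even when title and year agree; whether the real script bridges that case is not stated here.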
Metadata Merging
When duplicates are found, metadata is merged:
# Entry 1 (less complete)
@article{Smith2020,
author = {Smith, J.},
title = {Neural Processing},
year = {2020}
}
# Entry 2 (more complete)
@article{Smith2020_dup,
author = {Smith, John and Doe, Jane},
title = {Neural Processing},
journal = {Nature},
year = {2020},
doi = {10.1038/nature12345},
abstract = {Full abstract here...}
}
# Merged result (best of both; the longer author list is kept)
@article{Smith2020,
author = {Smith, John and Doe, Jane},
title = {Neural Processing},
journal = {Nature},
year = {2020},
doi = {10.1038/nature12345},
abstract = {Full abstract here...}
}
Rules:
- Prefers entries with more fields
- Prefers longer/more detailed field values
- Keeps all unique information
- First entry's citation key is used
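The rules above can be approximated with a short field-by-field merge. This is a sketch under assumptions: `merge_entries` is a hypothetical name, entries are plain dicts with the citation key under `ID`, and "longer" is measured by character count, which may differ from what the actual script does.

```python
def merge_entries(first, second):
    """Combine two duplicate entries, keeping the longer value for each field."""
    merged = dict(first)
    for field, value in second.items():
        if field == "ID":
            continue  # the first entry's citation key is kept
        current = merged.get(field, "")
        # A missing field counts as empty, so any value from `second` fills it.
        if len(value.strip()) > len(current.strip()):
            merged[field] = value
    return merged


less_complete = {"ID": "Smith2020", "author": "Smith, J.",
                 "title": "Neural Processing", "year": "2020"}
more_complete = {"ID": "Smith2020_dup", "author": "Smith, John and Doe, Jane",
                 "title": "Neural Processing", "journal": "Nature",
                 "year": "2020", "doi": "10.1038/nature12345"}
result = merge_entries(less_complete, more_complete)
print(result["ID"], "|", result["author"], "|", result["journal"])
```

Because length is the only criterion in this sketch, a longer but wrong value would win over a shorter correct one, so source entries should be reviewed when they disagree.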
Hash-Based Caching
The system uses MD5 hashing to detect changes and skip unnecessary merges.
How It Works
1. First compilation:
./scripts/shell/compile_manuscript.sh
# Merges files, creates cache
2. Subsequent compilations (no changes):
./scripts/shell/compile_manuscript.sh
# ✓ Bibliography cache valid (no changes detected)
# Merge skipped
3. After editing a .bib file:
vim 00_shared/bib_files/methods_refs.bib
./scripts/shell/compile_manuscript.sh
# Cache invalid, rebuilding...
# Merging 3 bibliography files...
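The check behind these three cases can be sketched as follows. The function names `input_hash` and `cache_is_valid` are hypothetical, and the choice to hash file names together with file contents is an assumption; only the use of MD5 and the `input_hash` cache field come from this guide.

```python
import hashlib
import json
from pathlib import Path


def input_hash(bib_dir, output_name="bibliography.bib"):
    """MD5 over names and contents of all source .bib files, in sorted order."""
    digest = hashlib.md5()
    for path in sorted(Path(bib_dir).glob("*.bib")):
        if path.name == output_name:
            continue  # the merged output is not an input
        digest.update(path.name.encode("utf-8"))
        digest.update(path.read_bytes())
    return digest.hexdigest()


def cache_is_valid(bib_dir, cache_name=".bibliography_cache.json"):
    """True when the merged output exists and the stored hash matches the inputs."""
    bib_dir = Path(bib_dir)
    cache_path = bib_dir / cache_name
    if not cache_path.exists() or not (bib_dir / "bibliography.bib").exists():
        return False
    try:
        cached = json.loads(cache_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return False  # a corrupt cache is treated as missing
    return cached.get("input_hash") == input_hash(bib_dir)
```

Sorting the files makes the hash independent of directory listing order, and including the file name means that renaming a source file also invalidates the cache.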
Cache File
Location: 00_shared/bib_files/.bibliography_cache.json
Contents:
{
"input_hash": "6f91cc7eab63a633f61ec01f9a2f7477",
"input_files": [
"field_background.bib",
"methods_refs.bib",
"my_papers.bib"
],
"output_file": "bibliography.bib",
"stats": {
"total_input": 13,
"unique_output": 12,
"duplicates_found": 1,
"duplicates_merged": 1
}
}
Notes:
- Automatically managed (don't edit)
- Git-ignored by default
- Safe to delete (will rebuild on next compile)
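For a quick look at the last merge without rerunning it, the cache can be read like any JSON file. The snippet below is a small sketch with a hypothetical helper, `summarize_cache`, and uses only the fields shown in the example above.

```python
import json


def summarize_cache(cache_text):
    """Return a one-line summary of the merge statistics stored in the cache."""
    cache = json.loads(cache_text)
    stats = cache["stats"]
    removed = stats["total_input"] - stats["unique_output"]
    return (f"{len(cache['input_files'])} files, "
            f"{stats['total_input']} entries in, "
            f"{stats['unique_output']} out, {removed} removed")


example = """{
  "input_hash": "6f91cc7eab63a633f61ec01f9a2f7477",
  "input_files": ["field_background.bib", "methods_refs.bib", "my_papers.bib"],
  "output_file": "bibliography.bib",
  "stats": {"total_input": 13, "unique_output": 12,
            "duplicates_found": 1, "duplicates_merged": 1}
}"""
print(summarize_cache(example))  # 3 files, 13 entries in, 12 out, 1 removed
```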
Performance
Without caching:
Bibliography Merge: ~2-5s (depends on file size)
With valid cache:
Bibliography Merge: <0.1s
Speedup: roughly 20-50x when files are unchanged (2-5s vs. under 0.1s)
Manual Merging
Basic Usage
# From project root
python3 scripts/python/merge_bibliographies.py
# Output:
# Merging 3 bibliography files...
# - field_background.bib
# - methods_refs.bib
# - my_papers.bib
# ✓ Merged bibliography saved: 00_shared/bib_files/bibliography.bib
# Input entries: 13
# Unique entries: 12
# Duplicates removed: 1
Command-Line Options
# Force rebuild (ignore cache)
python3 scripts/python/merge_bibliographies.py --force
# Quiet mode (no output)
python3 scripts/python/merge_bibliographies.py --quiet
# Custom directory
python3 scripts/python/merge_bibliographies.py /path/to/bib_files/
# Custom output filename
python3 scripts/python/merge_bibliographies.py -o merged.bib
# Help
python3 scripts/python/merge_bibliographies.py --help
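For readers who want to wrap or reimplement the script, the documented options map onto a small argparse interface. This is an illustrative sketch only: it assumes the script uses argparse, and the attribute names `bib_dir` and `output` and the default values are hypothetical.

```python
import argparse


def build_parser():
    """Parser mirroring the options listed above (--help is added by argparse)."""
    parser = argparse.ArgumentParser(
        description="Merge .bib files into one bibliography (illustrative sketch)")
    parser.add_argument("bib_dir", nargs="?", default="00_shared/bib_files",
                        help="directory containing the source .bib files")
    parser.add_argument("-o", dest="output", default="bibliography.bib",
                        help="name of the merged output file")
    parser.add_argument("--force", action="store_true",
                        help="ignore the cache and rebuild")
    parser.add_argument("--quiet", action="store_true",
                        help="suppress output")
    return parser


args = build_parser().parse_args(["--force", "-o", "merged.bib"])
print(args.bib_dir, args.output, args.force, args.quiet)
```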
When to Use Manual Merge
- Testing deduplication
- Checking merge results before compilation
- Rebuilding after major reorganization
- Debugging bibliography issues
Demo Files
Three demo files are included to demonstrate the system:
methods_refs.bib (4 entries)
Methods and techniques references:
- Neural signal processing
- Spectral analysis
- Machine learning
- Deep learning
field_background.bib (5 entries)
Field overview papers:
- Computational neuroscience review
- Brain networks
- Cognitive neuroscience
- Systems neuroscience
- Neuroscience textbook
my_papers.bib (4 entries)
Your own publications (with intentional duplicate):
- Novel method (2023) - appears twice with different details
- Previous work (2022)
- Conference paper (2021)
Testing with Demos
# See deduplication in action
python3 scripts/python/merge_bibliographies.py
# Expected:
# Input entries: 13
# Unique entries: 12
# Duplicates removed: 1
# (The duplicate in my_papers.bib is detected and merged)
Replacing Demo Files
Option 1: Delete and start fresh
cd 00_shared/bib_files/
rm methods_refs.bib field_background.bib my_papers.bib bibliography.bib
# Create your own organization
Option 2: Modify demo files
# Replace demo entries with your references
vim 00_shared/bib_files/methods_refs.bib
vim 00_shared/bib_files/field_background.bib
vim 00_shared/bib_files/my_papers.bib
Citation Styles
Citation style is configured in config/manuscript.yaml (or supplementary/revision configs).
Changing Style
Edit config:
# config/manuscript.yaml
citation_style: "unsrtnat" # Order of appearance (default)
# citation_style: "plainnat" # Alphabetical
# citation_style: "abbrvnat" # Abbreviated
# citation_style: "apalike" # APA-like
Compile:
./scripts/shell/compile_manuscript.sh
The citation style is automatically applied to 00_shared/latex_styles/bibliography.tex.
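How the compile script performs this step is not described here. One plausible mechanism, shown purely as an assumption, is to read the `citation_style` value and emit the corresponding LaTeX `\bibliographystyle` command. The helper name `bibliographystyle_line` and the regex-based parsing (instead of a YAML library) are choices made for this sketch.

```python
import re

# Matches an active (not commented-out) citation_style line.
STYLE_LINE = re.compile(r'^\s*citation_style:\s*["\']?([\w-]+)["\']?', re.MULTILINE)


def bibliographystyle_line(yaml_text, default="unsrtnat"):
    """Return the LaTeX command for the configured style, or for the default."""
    match = STYLE_LINE.search(yaml_text)
    style = match.group(1) if match else default
    return rf"\bibliographystyle{{{style}}}"


config = 'citation_style: "plainnat"  # Alphabetical\n# citation_style: "apalike"\n'
print(bibliographystyle_line(config))  # \bibliographystyle{plainnat}
```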
Available Styles
Numbered (Order of Appearance):
- unsrtnat - [1], [2], [3]... (default)
- ieeetr - IEEE style
Numbered (Alphabetical):
- plainnat - [1], [2], [3]... sorted by author (natbib loaded with the numbers option)
- abbrvnat - Abbreviated names
Author-Year:
- plainnat - (Smith, 2020) (natbib's default author-year mode)
- apalike - APA-like style
- chicago - Chicago style
Journal-Specific:
- elsarticle-num - Elsevier numbered
- naturemag - Nature style
- IEEEtran - IEEE Transactions
See 00_shared/latex_styles/bibliography.tex for full list and details.
Troubleshooting
Problem: Merge Not Running
Symptom:
No .bib files found in 00_shared/bib_files
Solution:
# Check files exist
ls 00_shared/bib_files/*.bib
# Ensure at least one source file other than bibliography.bib exists
ls 00_shared/bib_files/*.bib | grep -v '/bibliography\.bib$'
Problem: Duplicates Not Removed
Symptom:
Duplicate entries appear in final PDF bibliography.
Diagnosis:
# Run merge manually to see details
python3 scripts/python/merge_bibliographies.py
# Check if duplicates have same DOI
grep "doi = " 00_shared/bib_files/*.bib | grep "YOUR_DOI"
# Or same title+year
grep "title = " 00_shared/bib_files/*.bib | grep "YOUR_TITLE"
Solution:
Ensure duplicates have either:
- Same DOI (most reliable)
- Same title (case-insensitive, normalized) AND same year
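When the grep commands become tedious, a short script can list every DOI that occurs more than once across the source files. This is a diagnostic sketch with a hypothetical function name, `shared_dois`. It uses a regular expression instead of a BibTeX parser, so it assumes each `doi` field sits on its own line.

```python
import re
from collections import defaultdict

DOI_FIELD = re.compile(r'^\s*doi\s*=\s*[{"]([^}"]+)[}"]', re.IGNORECASE | re.MULTILINE)


def shared_dois(sources):
    """Map each repeated DOI to the files it occurs in (one item per occurrence).

    `sources` maps a file name to that file's text.
    """
    seen = defaultdict(list)
    for name, text in sources.items():
        for doi in DOI_FIELD.findall(text):
            seen[doi.strip().lower()].append(name)
    return {doi: files for doi, files in seen.items() if len(files) > 1}


sources = {
    "a.bib": "@article{x,\n doi = {10.1/ABC},\n}",
    "b.bib": "@article{y,\n doi = {10.1/abc}\n}",
    "c.bib": "@article{z,\n doi = {10.2/zzz}\n}",
}
print(shared_dois(sources))  # {'10.1/abc': ['a.bib', 'b.bib']}
```

To run it on real files, build `sources` from the directory, for example with `{p.name: p.read_text() for p in Path("00_shared/bib_files").glob("*.bib")}`. A DOI reported here that still appears twice in the PDF suggests the entries are not being merged; a duplicate that is not reported has mismatched DOIs and must rely on title+year.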
Problem: Cache Not Invalidating
Symptom:
Changes to .bib files not reflected after compilation.
Solution:
# Delete cache manually
rm 00_shared/bib_files/.bibliography_cache.json
# Or force rebuild
python3 scripts/python/merge_bibliographies.py --force
# Then compile
./scripts/shell/compile_manuscript.sh
Problem: Merge Script Not Found
Symptom:
ERROR: scripts/python/merge_bibliographies.py not found
Solution:
# Check script exists
ls -la scripts/python/merge_bibliographies.py
# Make the script executable (needed only when invoking it directly)
chmod +x scripts/python/merge_bibliographies.py
# Check Python is available
which python3
Problem: bibtexparser Not Installed
Symptom:
ERROR: bibtexparser not installed
Solution:
# Install package
pip install bibtexparser
# Or with Python 3
python3 -m pip install bibtexparser
Problem: Citations Appear as [?]
Symptom:
PDF shows [?] instead of citation numbers.
Causes:
- Bibliography file missing entries
- BibTeX compilation failed
- Citation keys don't match
Solution:
# Check merged bibliography contains your keys (any entry type)
grep -n "{YOUR_KEY," 00_shared/bib_files/bibliography.bib
# Run full compilation (3 passes)
./scripts/shell/compile_manuscript.sh
# Check BibTeX log
cat 01_manuscript/archive/manuscript.blg
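The third cause, mismatched citation keys, can be checked mechanically by comparing the keys cited in the manuscript with the keys defined in the merged bibliography. The sketch below uses a hypothetical function, `missing_keys`, and simple regular expressions, so unusual citation macros or exotic BibTeX formatting may be missed.

```python
import re

# \cite, \citep, \citet*, ... with optional [..] arguments.
CITE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# "@type{key," at the start of an entry.
ENTRY = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")


def missing_keys(tex, bib):
    """Return the set of cited keys that have no entry in the bibliography text."""
    cited = {key.strip()
             for group in CITE.findall(tex)
             for key in group.split(",") if key.strip()}
    defined = {key for kind, key in ENTRY.findall(bib)
               if kind.lower() not in {"string", "comment", "preamble"}}
    return cited - defined


tex = r"See \cite{Smith2020_NeuralProcessing, Chen2021} and \citep[p.~3]{Doe2019}."
bib = "@article{Smith2020_NeuralProcessing,\n title={X}}\n@book{Doe2019,\n title={Y}}"
print(missing_keys(tex, bib))  # {'Chen2021'}
```

Any key it reports will render as [?] until a matching entry is added to one of the source .bib files and the manuscript is recompiled.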
Problem: Metadata Lost During Merge
Symptom:
Merged entry missing some fields.
Explanation:
Merge prefers longer/more complete fields. Short fields are replaced by longer ones.
Solution:
Ensure the most complete entry has the longest field values:
# BAD: the entry carrying the DOI has abbreviated author and title values
@article{key1,
author = {Smith, J.},
title = {Title},
doi = {10.1038/nature12345}
}
# GOOD: the complete entry also has the full author and title values, which are kept
@article{key1,
author = {Smith, John and Doe, Jane},
title = {Full Title of Paper},
journal = {Nature},
year = {2020},
doi = {10.1038/nature12345}
}
Best Practices
File Organization
✅ Good:
00_shared/bib_files/
├── methods_signal_processing.bib
├── methods_machine_learning.bib
├── field_neuroscience.bib
├── field_deep_learning.bib
└── our_publications.bib
❌ Avoid:
00_shared/bib_files/
├── refs1.bib
├── refs2.bib
└── misc.bib
Citation Keys
✅ Good (descriptive, unique):
@article{Smith2020_NeuralProcessing,
@article{Chen2021_MachineLearning,
@article{YourName2023_NovelMethod,
❌ Avoid (generic):
@article{paper1,
@article{ref2,
@article{article3,
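A naming convention is easier to keep when it is checked automatically. The pattern below encodes one possible reading of the good examples above (capitalized author name, four-digit year, underscore, short topic). The pattern and the function name `is_descriptive_key` are assumptions for illustration and should be adapted to the project's own convention.

```python
import re

# Author + four-digit year + "_" + topic, e.g. Smith2020_NeuralProcessing
KEY_PATTERN = re.compile(r"^[A-Z][A-Za-z]*\d{4}_[A-Za-z0-9]+$")


def is_descriptive_key(key):
    """True when the citation key follows the Author + Year + _Topic convention."""
    return KEY_PATTERN.match(key) is not None


for key in ["Smith2020_NeuralProcessing", "YourName2023_NovelMethod", "paper1", "ref2"]:
    print(key, is_descriptive_key(key))
```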
Adding References
✅ Workflow:
1. Determine topic (methods, background, etc.)
2. Add to appropriate file
3. Compile (auto-merges)
4. Cite in manuscript: \cite{Smith2020_NeuralProcessing}
❌ Don't:
- Edit bibliography.bib directly (it's auto-generated)
- Add duplicates without checking
- Use inconsistent citation key formats
Version Control
Git ignore:
# .gitignore should include
00_shared/bib_files/.bibliography_cache.json
Commit:
# Commit source files, excluding the auto-generated merge output
git add -- '00_shared/bib_files/*.bib' ':(exclude)00_shared/bib_files/bibliography.bib'
git commit -m "Add new references for deep learning section"
# Do NOT commit 00_shared/bib_files/bibliography.bib (auto-generated)
Summary
| Feature | Description | Benefit |
|---|---|---|
| Multi-file | Organize by topic | Better organization |
| Deduplication | By DOI or title+year | No duplicate citations |
| Auto-merge | During compilation | No manual steps |
| Caching | Hash-based change detection | 20-50x faster when files are unchanged |
| Manual merge | merge_bibliographies.py | Testing & debugging |
Key Files:
- 00_shared/bib_files/*.bib - Source files (edit these)
- 00_shared/bib_files/bibliography.bib - Merged output (auto-generated)
- 00_shared/bib_files/.bibliography_cache.json - Cache (auto-managed)
- scripts/python/merge_bibliographies.py - Merge script
Key Commands:
# Compile (auto-merge)
./scripts/shell/compile_manuscript.sh
# Manual merge
python3 scripts/python/merge_bibliographies.py
# Force rebuild
python3 scripts/python/merge_bibliographies.py --force
For more information, see:
- 00_shared/bib_files/README.md - Quick reference
- 00_shared/latex_styles/bibliography.tex - Citation style details
- scripts/python/merge_bibliographies.py - Implementation