LeanProductivity MarkItDown Batch Converter (No GUI)

A simple, no-GUI batch tool that converts various file formats to Markdown using MarkItDown.
Built for developers, writers, and knowledge workers who want a local, fast, and scriptable solution.

🎞️ Tutorial

📦 Features

🧰 Requirements

🚀 Usage

python "LP MID Bulk Converter.py"

You'll be prompted to enter file extensions to convert:

Enter file extensions to convert (comma-separated, e.g. docx,pdf,html):
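The comma-separated answer has to be normalized before it can be compared with file suffixes. The following is a minimal sketch of that step, not the script's exact code: `parse_extensions` is a hypothetical helper name (the script does this inline with a set comprehension), and tolerating a leading dot such as `.pdf` is an added assumption.

```python
def parse_extensions(raw: str) -> set:
    """Turn 'docx, PDF ,html' into {'.docx', '.pdf', '.html'}."""
    extensions = set()
    for part in raw.split(","):
        # Trim whitespace, drop an optional leading dot, and lowercase the name.
        name = part.strip().lstrip(".").lower()
        if name:  # ignore empty entries such as a trailing comma
            extensions.add("." + name)
    return extensions


print(sorted(parse_extensions("docx, PDF ,.html,")))
```

Lowercasing both the entered extensions and each file's suffix is what lets `PDF` match `report.pdf` as well as `REPORT.PDF`.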

All matching files under the configured input folder are converted recursively to .md and saved under the output folder, preserving the folder structure.
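How a source path maps to its output path can be sketched as follows. The helper name `destination_for` and the example folders are illustrative assumptions; the `relative_to` and `with_suffix` calls are the ones the script itself relies on.

```python
from pathlib import Path


def destination_for(src: Path, input_root: Path, output_root: Path) -> Path:
    """Mirror src's position under input_root into output_root, as a .md file."""
    relative = src.relative_to(input_root)  # e.g. reports/2024/summary.docx
    return output_root / relative.with_suffix(".md")


src = Path("Input") / "reports" / "2024" / "summary.docx"
print(destination_for(src, Path("Input"), Path("Output")))
```

One consequence of replacing the suffix: two sources in the same folder that differ only by extension, such as `test.docx` and `test.pdf`, map to the same `test.md`, so whichever is processed later may overwrite the first or be treated as already up to date.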

📁 Configuration

Edit these lines in the script to set your folders:

input_folder = Path(r"d:\GitProjects\Input\Demo Files")
output_folder = Path(r"d:\GitProjects\Output")
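If editing the script for every run is inconvenient, one possible variation is to let environment variables override the hard-coded folders. This is only a sketch and not part of the script: the helper `folder_from_env` and the variable names `LP_MID_INPUT` and `LP_MID_OUTPUT` are made up for illustration.

```python
import os
from pathlib import Path


def folder_from_env(var_name: str, default: str, environ=os.environ) -> Path:
    """Use the environment variable if it is set and non-empty, else the default."""
    value = environ.get(var_name, "").strip()
    return Path(value) if value else Path(default)


# LP_MID_INPUT and LP_MID_OUTPUT are hypothetical names chosen for this sketch.
input_folder = folder_from_env("LP_MID_INPUT", r"d:\GitProjects\Input\Demo Files")
output_folder = folder_from_env("LP_MID_OUTPUT", r"d:\GitProjects\Output")
```

When neither variable is set, the folders fall back to the same defaults shown above, so the behavior is unchanged unless you opt in.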

📋 Conversion Summary

After running, the script prints a summary with the number of files converted, skipped, and failed.

🧪 Example Output

▶️ Converting file: test.docx
✅ Converted: test.md
⏭️ Skipped (up-to-date): test.pdf
❌ Error converting test.wav: UnknownValueError – No speech detected
=== Conversion Summary ===
Converted: 12
Skipped: 3
Errors: 1
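The "Skipped (up-to-date)" line comes from comparing modification times of the source and its Markdown output. A minimal sketch of that decision, assuming a hypothetical helper name `needs_conversion` (the script performs the same comparison inline, with `force_convert` playing the role of `force`):

```python
from pathlib import Path


def needs_conversion(src: Path, dst: Path, force: bool = False) -> bool:
    """Return True when dst is missing, older than src, or force is set."""
    if force or not dst.exists():
        return True
    # Up to date means the output was written at or after the source's last change.
    return dst.stat().st_mtime < src.stat().st_mtime
```

Because the check looks only at timestamps, touching a source file without changing its content is enough to trigger a reconversion on the next run.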

🧑‍💻 Author

Sascha D. Kasper – LeanProductivity
GitHub | YouTube

📄 License

MIT License.
See LICENSE.txt for details.

📃 Script

# ─── APPLICATION METADATA ──────────────────────────────────────────────────
APP_NAME = "LeanProductivity MarkItDown Batch Converter no GUI"
APP_DESCRIPTION = "A no GUI batch converter for MarkItDown to convert various file formats to Markdown."
VERSION = "00.07.20250620"
AUTHOR_NAME = "Sascha D. Kasper - LeanProductivity"
HELP_URL = "https://github.com/microsoft/markitdown"
# ───────────────────────────────────────────────────────────────────────────

import os
import subprocess
from pathlib import Path
from markitdown import MarkItDown

# --- Optional: FFmpeg path for audio support (only relevant if pydub is installed) ---
try:
    from pydub import AudioSegment
    ffmpeg_path = os.path.join("resources", "bin", "ffmpeg.exe")
    if not Path(ffmpeg_path).is_file():
        raise FileNotFoundError(f"FFmpeg not found at {ffmpeg_path}")
    AudioSegment.converter = ffmpeg_path
except ImportError:
    pass  # pydub not installed or not needed for current file types
except Exception as e:
    print(f"⚠️ FFmpeg configuration warning: {e}")

# --- Configuration ---
input_folder = Path(r"d:\GitProjects\Input\Demo Files")  # set this to your input folder
output_folder = Path(r"d:\GitProjects\Output")  # set this to your output folder

# --- Extension input from user ---
ext_input = input("Enter file extensions to convert (comma-separated, e.g. docx,pdf,html): ")
# Normalize each entry to a lowercase suffix with one leading dot, e.g. " PDF" or ".pdf" -> ".pdf"
supported_extensions = {
    "." + ext.strip().lstrip(".").lower()
    for ext in ext_input.split(",")
    if ext.strip().lstrip(".")
}

force_convert = False  # set to True to ignore modification times
dry_run = False  # set to True to simulate conversion only

# --- Init ---
md = MarkItDown()  # created here, but the loop below converts through the markitdown CLI
files_converted = 0
files_skipped = 0
errors = 0

# --- Conversion ---
for root, _, files in os.walk(input_folder):
    for file in files:
        src_path = Path(root) / file
        if src_path.suffix.lower() not in supported_extensions:
            continue

        rel_path = src_path.relative_to(input_folder)
        dst_path = output_folder / rel_path.with_suffix(".md")

        # Skip files whose Markdown output is at least as new as the source
        if dst_path.exists() and not force_convert:
            if dst_path.stat().st_mtime >= src_path.stat().st_mtime:
                print(f"⏭️ Skipped (up-to-date): {rel_path}")
                files_skipped += 1
                continue

        if dry_run:
            print(f"🔎 Would convert: {rel_path}")
            files_converted += 1
            continue

        # print(f"▶️ Converting file: {src_path} (resolved: {src_path.resolve()})") - uncomment for debugging
        # print(f"📂 Working directory: {os.getcwd()}") - uncomment for debugging
        try:
            result = subprocess.run(
                ["markitdown", str(src_path.resolve())],
                capture_output=True,
                text=True,
                check=True,
            )
            # Create the output folder only when something is actually written,
            # so a dry run leaves the file system untouched
            dst_path.parent.mkdir(parents=True, exist_ok=True)
            with open(dst_path, "w", encoding="utf-8") as f:
                f.write(result.stdout)
            # Count each file exactly once, and only after a successful write
            print(f"✅ Converted: {rel_path}")
            files_converted += 1
        except subprocess.CalledProcessError as e:
            print(f"❌ CLI failed for {rel_path}: {e.stderr.strip()}")
            errors += 1
        except Exception as e:
            print(f"❌ Error converting {rel_path}: {type(e).__name__} – {e}")
            errors += 1

# --- Summary ---
print("\n=== Conversion Summary ===")
print(f"Converted: {files_converted}")
print(f"Skipped: {files_skipped}")
print(f"Errors: {errors}")
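The script creates a `MarkItDown()` instance but performs the conversion through the `markitdown` command line. As an alternative, the conversion could run in-process; the sketch below assumes a hypothetical helper name `convert_in_process` and takes the converter as a parameter so it can be exercised without the library installed. It relies on `MarkItDown.convert` returning an object with a `text_content` attribute.

```python
from pathlib import Path
from typing import Callable


def convert_in_process(src: Path, dst: Path, convert: Callable) -> None:
    """Convert src with the given callable and write its Markdown text to dst."""
    result = convert(str(src))  # expected to expose the Markdown as .text_content
    dst.parent.mkdir(parents=True, exist_ok=True)
    dst.write_text(result.text_content, encoding="utf-8")
```

Inside the loop this would be called as `convert_in_process(src_path, dst_path, md.convert)`. Whether this behaves identically to the CLI for every file type, audio in particular, is not something the script confirms, so treat it as an option to test rather than a drop-in replacement.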