Agron Merseli

Python Scripts for Notepad++

When you work on different platforms, you sometimes have to move source files from one place to another. Usually this simple copy-and-paste operation causes no problems, but Eclipse treats sources as plain text files, so the file encoding matters if you want to avoid surprises, especially when there are many files.

Wrong text encoding
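To make the problem concrete, here is a minimal sketch in plain Python 3 (run outside Notepad++) of how the same word ends up as different bytes in the two encodings, and how it gets garbled when UTF-8 bytes are read as Cp1252. The sample word is only an illustration and does not come from the project sources.

```python
# Illustrative sample text: "caffe" with a grave accent on the final e.
text = "caff\u00e8"

# The same characters become different bytes depending on the encoding.
cp1252_bytes = text.encode("cp1252")   # b'caff\xe8'      (1 byte for the accented e)
utf8_bytes = text.encode("utf-8")      # b'caff\xc3\xa8'  (2 bytes for the accented e)

# Reading UTF-8 bytes as if they were Cp1252 produces the typical garbling:
# the two bytes 0xC3 0xA8 are shown as two separate characters.
garbled = utf8_bytes.decode("cp1252")  # 'caff' followed by U+00C3 and U+00A8

print(cp1252_bytes)
print(utf8_bytes)
```

Plain ASCII text is identical in both encodings, which is why the problem only shows up in files that contain accented characters.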

Eclipse on Windows usually encodes text files as “Cp1252”. If your sources contain accented Latin characters, the files need to be encoded in “UTF-8”, and the first option to change in the IDE is the following.

In Window – Preferences – General – Workspace – Text file encoding, select Other: UTF-8

At this point, proceed to import the sources into the project.

If the sources have already been imported, it is necessary to convert them to the correct encoding.

Here you can use a very handy tool that runs macros and scripts written in Python inside Notepad++. The plugin is called Python Script and it can be installed from the Notepad++ Plugin Manager (called Plugins Admin in recent versions).

To create a new script, choose Plugins – Python Script – New Script and give a name to the script you want to create.

In this case, the script that converts the files to UTF-8 with BOM is as follows.

import os
from Npp import notepad, console, MENUCOMMAND

# Folder containing the sources to convert; adjust it to your workspace.
# Note: os.walk() does not expand wildcards such as "*".
filePathSrc = "D:\\eclipse\\eclipse-workspace\\LibroJava11\\src\\LibroJava*11"

# Binary files that must not be opened and re-encoded (compared in lower case).
skipExtensions = ('.jar', '.ear', '.gif', '.jpg', '.jpeg', '.xls', '.png', '.cab', '.ico')

for root, dirs, files in os.walk(filePathSrc):
    for fn in files:
        if not fn.lower().endswith(skipExtensions):
            path = os.path.join(root, fn)
            notepad.open(path)
            console.write(path + "\r\n")
            # Does not work --> notepad.runMenuCommand("Encoding", "Character sets", "Chinese", "GB2312 (Simplified)")
            # notepad.menuCommand(MENUCOMMAND.FORMAT_GB2312)
            # notepad.runMenuCommand("Encoding", "Convert to UTF-8-BOM")
            notepad.menuCommand(MENUCOMMAND.FORMAT_CONV2_UTF_8)
            # Reference: https://github.com/bruderstein/PythonScript/blob/master/PythonScript/src/NotepadPython.cpp
            notepad.save()
            notepad.close()

Note that the “filePathSrc” string holds the path of the folder containing the sources, and that the “notepad.menuCommand” method receives the “MENUCOMMAND.FORMAT_CONV2_UTF_8” command, which converts the file to UTF-8 with BOM.
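For comparison, roughly the same conversion can be sketched in plain Python 3, without Notepad++. This is not the plugin's implementation, only an illustration of what the conversion does to the bytes of a file. The helper name `convert_to_utf8_bom` is hypothetical, and the sketch assumes the files are currently encoded as Cp1252. Notepad++ detects the current encoding by itself, while this sketch simply trusts the `source_encoding` argument.

```python
import codecs


def convert_to_utf8_bom(path, source_encoding="cp1252"):
    """Re-encode one text file as UTF-8 with BOM (hypothetical helper).

    Returns True if the file was rewritten, False if it already
    started with the UTF-8 BOM and was left untouched.
    """
    with open(path, "rb") as f:
        raw = f.read()

    # Skip files that were already converted.
    if raw.startswith(codecs.BOM_UTF8):
        return False

    # Assumption: the file really is in source_encoding. A byte that is
    # undefined in that encoding raises UnicodeDecodeError here.
    text = raw.decode(source_encoding)

    # Working on bytes keeps the original line endings unchanged.
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8 + text.encode("utf-8"))
    return True
```

Try it on a copy of the sources first, because a wrong guess for the source encoding rewrites the file with the wrong characters.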

Save the script once complete.

To launch the script, open Plugins – Python Script – Scripts and select the script you created.

At this point, reload the source in Eclipse to verify that the file encoding is now correct.

For more information about the Python Script plugin and for many other useful scripts, visit the developer’s GitHub page.
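The result can also be checked outside Eclipse. The following plain Python 3 sketch lists the files in a folder that do not start with the UTF-8 BOM. The function names and the default `.java` extension filter are assumptions made for this example, not part of the plugin.

```python
import codecs
import os


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 byte order mark."""
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def files_without_bom(root_dir, extensions=(".java",)):
    """Walk root_dir and collect matching files that lack the UTF-8 BOM."""
    missing = []
    for root, dirs, files in os.walk(root_dir):
        for fn in files:
            if fn.lower().endswith(extensions):
                path = os.path.join(root, fn)
                if not has_utf8_bom(path):
                    missing.append(path)
    return sorted(missing)
```

Calling `files_without_bom` on the source folder returns an empty list once every matching file has been converted.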

Rename ConvertToUFT-8-BOM.txt to ConvertToUFT-8-BOM.py
