
Adding books to my library database

To follow up on the two posts I wrote a couple of weeks ago, I thought I’d explain how I use the Library of Congress catalog to add entries to the SQLite database of my technical books.

As a reminder, the database consists of three tables, book, author, and book_author. The fields for each table are listed below:

book        author   book_author
---------   ------   -----------
id          id       id
title       name     book_id
subtitle             author_id
volume
edition
publisher
published
lccn
loc
added

In each table, the id field is a sequential integer that’s automatically generated when a new record is added. The book_id and author_id fields in the book_author table are references to the id values in the other two tables. They tie book and author together and handle the many-to-many relationship between the two.
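Here is a minimal sketch of how three tables like these could be set up and joined. The column types and the `references` clauses are assumptions, since only the field names are listed above, and the in-memory database is just for illustration.

```python
import sqlite3

# Assumed schema: the field names come from the table above, but the
# types and foreign-key clauses are guesses, not the actual DDL.
SCHEMA = """
create table book (
  id integer primary key,
  title text, subtitle text, volume text, edition text,
  publisher text, published text, lccn text, loc text, added text
);
create table author (
  id integer primary key,
  name text
);
create table book_author (
  id integer primary key,
  book_id integer references book(id),
  author_id integer references author(id)
);
"""

con = sqlite3.connect(':memory:')
con.executescript(SCHEMA)
cur = con.cursor()

# One book with two authors
cur.execute('insert into book(title, lccn) values(?, ?)',
            ['An introduction to dynamics', '83025283'])
book_id = cur.lastrowid
for name in ['McGill, David J.', 'King, Wilton W.']:
    cur.execute('insert into author(name) values(?)', [name])
    author_id = cur.lastrowid
    cur.execute('insert into book_author(book_id, author_id) values(?, ?)',
                [book_id, author_id])

# Follow the book_author links from the book to its authors
rows = cur.execute("""
    select author.name from author
    join book_author on book_author.author_id = author.id
    where book_author.book_id = ?
    order by book_author.id
""", [book_id]).fetchall()
authors_of_book = [r[0] for r in rows]
print(authors_of_book)
```

Declaring `id` as `integer primary key` is what makes SQLite generate the sequential values automatically.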

The author table is about as simple as it can be. I don’t care about breaking names into given and surname, nor do I care about their birth/death dates. The names are saved in last, first [middle] format, i.e.,

Ang, Alfredo Hua-Sing
Gere, James M.
King, Wilton W.
McGill, David J.
Tang, Wilson H.
Timoshenko, Stephen

Most of the book fields are self-explanatory. The published field is the publication date (just a year), and added is the date I added the book to the database, which helps when I want to print out information about recently added books. SQLite doesn’t have a date datatype, so added is a string in yyyy-mm-dd format. The loc is the Library of Congress Classification, an alphanumeric identifier similar to the Dewey Decimal system. The lccn is the Library of Congress Control Number, which is basically a serial number that’s prefixed by the year. It has nothing to do with classification by topic or shelving, but it’s the key to quickly collecting all the other data on books in the Library of Congress catalog.
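Because yyyy-mm-dd strings sort the same way the dates do, a plain string comparison is enough to pull out recently added books. The sketch below assumes a cut-down book table, made-up titles, and a fixed "today" so the result is reproducible.

```python
import sqlite3
from datetime import date, timedelta

con = sqlite3.connect(':memory:')
cur = con.cursor()
# Only the columns needed for this illustration
cur.execute('create table book (id integer primary key, title text, added text)')

# Hypothetical rows; the titles and dates are invented
cur.executemany('insert into book(title, added) values(?, ?)', [
    ('Old book', '2024-11-03'),
    ('Recent book', '2025-04-12'),
    ('Newest book', '2025-04-19'),
])

today = date(2025, 4, 20)                          # fixed for reproducibility
cutoff = (today - timedelta(days=30)).isoformat()  # '2025-03-21'

# String comparison works because the format is year-month-day, zero-padded
query = 'select title from book where added >= ? order by added desc'
recent = [row[0] for row in cur.execute(query, [cutoff])]
print(recent)
```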

I shelve my books according to the Library of Congress Classification. All other things being equal, I’d prefer to use the Dewey Decimal system because that’s the system used by most of the libraries I’ve patronized.1 But all other things are not equal.

  1. Virtually all of my technical books are in the Library of Congress.
  2. The LoC catalog is freely available online.
  3. The LoC’s records can be downloaded in a convenient format.
  4. Unfortunately, many of the LoC records do not include a Dewey Decimal number.

The advantages of using the LoC Classification far outweigh my short-lived comfort. I’m getting used to my structural engineering books being in the TA600 series instead of the 624.1 series.

The Library of Congress keeps its catalog records in a few different formats. There’s the venerable MARC format, which uses numbers to identify fields and letters to identify subfields. There’s MARCXML, which is a more or less direct translation of MARC into XML. Neither of these was appealing to me. But there’s also the MODS format, which uses reasonable names for the various elements. For example, here’s the MODS record for An Introduction to Dynamics by McGill & King:

xml:
<?xml version="1.0" encoding="UTF-8"?><mods xmlns="http://www.loc.gov/mods/v3" xmlns:zs="http://docs.oasis-open.org/ns/search-ws/sruResponse" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="3.8" xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-8.xsd">
  <titleInfo>
    <nonSort xml:space="preserve">An </nonSort>
    <title>introduction to dynamics</title>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Engineering mechanics</title>
  </titleInfo>
  <name type="personal" usage="primary">
    <namePart>McGill, David J.,</namePart>
    <namePart type="date">1939-</namePart>
  </name>
  <name type="personal">
    <namePart>King, Wilton W.,</namePart>
    <namePart type="date">1937-</namePart>
  </name>
  <typeOfResource>text</typeOfResource>
  <originInfo>
    <place>
      <placeTerm authority="marccountry" type="code">cau</placeTerm>
    </place>
    <dateIssued encoding="marc">1984</dateIssued>
    <issuance>monographic</issuance>
    <place>
      <placeTerm type="text">Monterey, Calif</placeTerm>
    </place>
    <agent>
      <namePart>Brooks/Cole Engineering Division</namePart>
    </agent>
    <dateIssued>c1984</dateIssued>
  </originInfo>
  <language>
    <languageTerm authority="iso639-2b" type="code">eng</languageTerm>
  </language>
  <physicalDescription>
    <form authority="marcform">print</form>
    <extent>xv, 608 p. : ill. (some col.) ; 25 cm.</extent>
  </physicalDescription>
  <note type="statement of responsibility">David J. McGill and Wilton W. King.</note>
  <note>Cover title: Engineering mechanics.</note>
  <note>Includes index.</note>
  <subject authority="lcsh">
    <topic>Dynamics</topic>
  </subject>
  <classification authority="lcc">TA352 .M385 1984</classification>
  <classification authority="ddc" edition="19">620.1/04</classification>
  <identifier type="isbn">0534029337</identifier>
  <identifier type="lccn">83025283</identifier>
  <recordInfo>
    <descriptionStandard>aacr</descriptionStandard>
    <recordContentSource authority="marcorg">DLC</recordContentSource>
    <recordCreationDate encoding="marc">831128</recordCreationDate>
    <recordChangeDate encoding="iso8601">19840409000000.0</recordChangeDate>
    <recordIdentifier>4242715</recordIdentifier>
    <recordOrigin>Converted from MARCXML to MODS version 3.8 using MARC21slim2MODS3-8_XSLT1-0.xsl
                (Revision 1.172 20230208)</recordOrigin>
  </recordInfo>
</mods>

OK, XML isn’t as nice as JSON, but there are Python modules for parsing it, and it’s relatively easy to pick out the elements I want to put in my database.

And if you know a book’s LCCN, you can get its MODS record using a simple URL. The LCCN of McGill & King’s book is 83025283, and the URL to download it is

https://lccn.loc.gov/83025283/mods

That’s very convenient.

It does raise the question, though, of how you get the LCCN of a book. In many of my books, especially the older ones, the LCCN is printed on the copyright page. Here’s a photo of the copyright page of Timoshenko & Gere’s Theory of Elastic Stability:

Theory of Elastic Stability copyright page

The name of the LCCN has changed over the years, and you’ll sometimes see it like this with a dash between the year and the serial number, but it’s easy to convert it to the current canonical form.

If the LCCN isn’t in the book, I use the LoC’s Advanced Search form to find it. This allows searches by title, author, ISBN (my older books don’t have ISBNs, but all the newer books do), or combinations. The record that comes up will always have the LCCN.

However I manage to get the LCCN, I then run the lccn2library command with the LCCN as its argument. That adds the necessary entries to the book, author, and book_author tables and returns the loc catalog value. For McGill & King’s book, it would work like this:

lccn2library 83025283

which returns

TA352 .M385 1984

This typically gets printed on a label that I stick on the spine of the book before shelving it.

Here’s the code for lccn2library. It’s longer than the scripts I usually post here, but that’s because there are a lot of details that have to be handled.

python:
  1:  #!/usr/bin/env python3
  2:  
  3:  import sys
  4:  import re
  5:  import sqlite3
  6:  import requests
  7:  from unicodedata import normalize
  8:  import xml.etree.ElementTree as et
  9:  from datetime import date
 10:  import time
 11:  
 12:  ########## Functions ##########
 13:  
 14:  def canonicalLCCN(lccn):
 15:    """Return an LCCN with no hyphens and the correct number of digits.
 16:  
 17:    20th century LCCNs have a 2-digit year. 21st century LCCNs have a
 18:    4-digit year. The serial number needs to be 6 digits. Pad with
 19:    zeros if necessary."""
 20:  
 21:    # All digits
 22:    if re.search(r'^\d+$', lccn):
 23:      if len(lccn) == 8 or len(lccn) == 10:
 24:        return lccn
 25:      else:
 26:        return lccn[:2] + f'{int(lccn[2:]):06d}'
 27:    # 1-3 lowercase letters followed by digits
 28:    elif m := re.search(r'^([a-z]{1,3})(\d+)$', lccn):
 29:      if len(m.group(2)) == 8 or len(m.group(2)) == 10:
 30:        return lccn
 31:      else:
 32:        return m.group(1) + m.group(2)[:2] + f'{int(m.group(2)[2:]):06d}'
 33:    # 20th century books are sometimes given with a hyphen after the year
 34:    elif m := re.search(r'^(\d\d)-(\d+)$', lccn):
 35:      return m.group(1) + f'{int(m.group(2)):06d}'
 36:    else:
 37:      raise ValueError(f'{lccn} is in an unknown form')
 38:  
 39:  def correctName(name):
 40:    """Return the author name without spurious trailing commas,
 41:    space, or periods."""
 42:  
 43:    # Regex for finding trailing periods that are not from initials
 44:    trailingPeriod = re.compile(r'([^A-Z])\.$')
 45:  
 46:    name = name.rstrip(', ')
 47:    name = trailingPeriod.sub(r'\1', name)
 48:    return(name)
 49:  
 50:  def dbAuthors(cur):
 51:    """Return a dictionary of all the authors in the database.
 52:    Keys are names and values are IDs."""
 53:  
 54:    res = cur.execute('select name, id from author')
 55:    authorList = res.fetchall()
 56:    return dict(authorList)
 57:  
 58:  def addAuthor(cur, name):
 59:    """Add a new author to the database and return the ID."""
 60:  
 61:    params = [name]
 62:    insertCmd = 'insert into author(name) values(?)'
 63:    res = cur.execute(insertCmd, params)
 64:    params = [name]
 65:    idCmd = 'select id from author where name = ?'
 66:    res = cur.execute(idCmd, params)
 67:    return res.fetchone()[0]
 68:  
 69:  def bookData(root):
 70:    """Return a dictionary of information about the book.
 71:  
 72:    Keys are field names and values are from the book.
 73:    If a field name is missing, it's given the value None."""
 74:  
 75:    # Initialize the dictionary
 76:    book = dict()
 77:  
 78:    # Collect the book information from the MODS XML record
 79:    # Use the order in the database: title, subtitle, volume,
 80:    # edition, publisher, published, lccn, loc
 81:  
 82:    # The default namespace for mods in XPath searches
 83:    ns = {'m': 'http://www.loc.gov/mods/v3'}
 84:  
 85:    # Get the title, subtitle, and part/volume
 86:    for t in root.findall('m:titleInfo', ns):
 87:      if len(t.attrib.keys()) == 0:
 88:        # Title
 89:        try:
 90:          starter = t.find('m:nonSort', ns).text
 91:        except AttributeError:
 92:          starter = ''
 93:        book['title'] = starter + t.find('m:title', ns).text.rstrip(', ')
 94:  
 95:        # Subtitle
 96:        try:
 97:          book['subtitle'] = t.find('m:subTitle', ns).text.rstrip(', ')
 98:        except AttributeError:
 99:          book['subtitle'] = None
100:  
101:        # Part/volume
102:        try:
 103:          book['volume'] = t.find('m:partName', ns).text.rstrip(', ')
104:        except AttributeError:
105:          book['volume'] = None
106:  
107:    # Get the origin/publishing information
108:    # Edition
109:    try:
110:      book['edition'] = root.find('m:originInfo/m:edition', ns).text
111:    except AttributeError:
112:      book['edition'] = None
113:  
114:    # Publisher
115:    try:
116:      book['publisher'] = root.find('m:originInfo/m:agent/m:namePart', ns).text
117:    except AttributeError:
118:      book['publisher'] = None
119:  
120:    # Date published
121:    try:
122:      book['published'] = root.find('m:originInfo/m:dateIssued', ns).text
123:    except AttributeError:
124:      book['published'] = None
125:  
126:    # ID numbers
127:    # LCCN (must be present)
128:    book['lccn'] = root.find('m:identifier[@type="lccn"]', ns).text
129:  
130:    # LOC classification number (must be present)
131:    book['loc'] = root.find('m:classification[@authority="lcc"]', ns).text
132:  
133:    # Date added to database is today
134:    book['added'] = date.today().strftime('%Y-%m-%d')
135:  
136:    return book
137:  
138:  def authorData(cur, root):
139:    """Return a dictionary of authors of the book, primary first.
140:  
141:    The keys are the author names and values are their IDs.
142:    Authors not already in the database are added to it."""
143:  
144:    # The default namespace for mods in XPath searches
145:    ns = {'m': 'http://www.loc.gov/mods/v3'}
146:  
147:    # Get all the authors (primary and secondary) of the book
148:    # The primary author goes first in the authors list
149:    authors = []
150:    names = root.findall('m:name', ns)
151:    pnames = root.findall('m:name[@usage="primary"]', ns)
152:    snames = list(set(names) - set(pnames))
153:    pnames = [ correctName(n.find('m:namePart', ns).text) for n in pnames ]
154:    snames = [ correctName(n.find('m:namePart', ns).text) for n in snames ]
155:  
156:    # Get the authors already in the database
157:    existingAuthors = dbAuthors(cur)
158:  
159:    # Determine which authors are new to the database and add them.
160:    # The primary author comes first.
161:    authors = dict()
162:    for n in pnames:
163:      if n in existingAuthors.keys():
164:        authors[n] = existingAuthors[n]
165:      else:
166:        newID = addAuthor(cur, n)
167:        authors[n] = newID
168:    for n in snames:
169:      if n in existingAuthors.keys():
170:        authors[n] = existingAuthors[n]
171:      else:
172:        newID = addAuthor(cur, n)
173:        authors[n] = newID
174:  
175:    return authors
176:  
177:  
178:  ########## Main program ##########
179:  
180:  # Connect to the database
181:  con = sqlite3.connect('library.db')
182:  cur = con.cursor()
183:  
184:  # Get the LCCN from the argument
185:  lccn = canonicalLCCN(sys.argv[1])
186:  
187:  # Get and parse the MODS data for the LCCN
188:  r = requests.get(f'https://lccn.loc.gov/{lccn}/mods')
189:  mods = normalize('NFC', r.content.decode())
190:  root = et.fromstring(mods)
191:  
192:  # Collect the book data and add it to the book table
193:  book = bookData(root)
194:  params = list(book.values())
195:  insertCmd = 'insert into book(title, subtitle, volume, edition, publisher, published, lccn, loc, added) values(?, ?, ?, ?, ?, ?, ?, ?, ?);'
196:  res = cur.execute(insertCmd, params)
197:  params = [book["lccn"]]
198:  idCmd = f'select id from book where lccn = ?'
199:  res = cur.execute(idCmd, params)
200:  bookID = res.fetchone()[0]
201:  
202:  # Collect the authors, adding the new ones to the author table
203:  authors = authorData(cur, root)
204:  
205:  # Add entries to the book_author table
206:  for authorID in authors.values():
207:    params = [bookID, authorID]
208:    insertCmd = f'insert into book_author(book_id, author_id) values(?, ?)'
209:    res = cur.execute(insertCmd, params)
210:  
211:  # Commit and close the database
212:  con.commit()
213:  con.close()
214:  
215:  # Print the LOC classification number
216:  print(book['loc'])

The script starts with a couple of utility functions, canonicalLCCN and correctName. The former (Lines 14–37) takes an LCCN as its argument and returns it in the form needed in the URL we talked about above. For 20th century books, that form is a two-digit year followed by a six-digit serial number. For 21st century books, it’s a four-digit year followed by a six-digit serial number. In both cases, the serial number part is padded with zeros to make it six digits long. Hyphens are removed. Oh, and there can sometimes be 1–3 lowercase letters in front of the digits.
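To make the rules concrete, here is a compact sketch of the same idea. It is not the script's function: `canonical_lccn` is a hypothetical, simplified variant that only accepts the hyphenated form (with either a two- or four-digit year) and the already-canonical form. The `2001-12345` input is invented for illustration.

```python
import re

def canonical_lccn(lccn):
    """Simplified, hypothetical normalizer for two LCCN spellings only."""
    # Hyphenated: optional 1-3 letter prefix, 2- or 4-digit year, serial
    m = re.fullmatch(r'([a-z]{0,3})(\d{2}|\d{4})-(\d{1,6})', lccn)
    if m:
        prefix, year, serial = m.groups()
        return f'{prefix}{year}{int(serial):06d}'   # zero-pad serial to 6 digits
    # Already canonical: optional prefix plus 8 or 10 digits
    if re.fullmatch(r'[a-z]{0,3}(\d{8}|\d{10})', lccn):
        return lccn
    raise ValueError(f'{lccn} is in an unknown form')

print(canonical_lccn('83-25283'))     # 83025283
print(canonical_lccn('83025283'))     # 83025283
print(canonical_lccn('2001-12345'))   # 2001012345
```

Unlike the script, this sketch rejects unhyphenated numbers that are too short, since without the hyphen it can't tell where the year ends.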

correctName (Lines 39–48) is necessary because I noticed that sometimes the authors’ names are followed by spurious commas or periods. You can see trailing commas in both authors’ names in the MODS file shown above. I think these extra bits of punctuation made some sense in the MARC format, but I don’t want them in my database.
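A few worked inputs show what the two cleanup steps do. The function below has the same logic as correctName, renamed `correct_name` here. Only the first input comes from the MODS record above; the other two are hypothetical.

```python
import re

# A trailing period is removed only when the character before it is
# not a capital letter, so initials like "J." survive.
trailing_period = re.compile(r'([^A-Z])\.$')

def correct_name(name):
    name = name.rstrip(', ')                # drop trailing commas and spaces
    return trailing_period.sub(r'\1', name)

print(correct_name('McGill, David J.,'))      # McGill, David J.
print(correct_name('Timoshenko, Stephen.'))   # Timoshenko, Stephen
print(correct_name('Strunk, William, Jr.'))   # Strunk, William, Jr
```

The last line shows a side effect of the rule: suffixes such as "Jr." lose their period as well.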

Often, the book I’m adding to the database has one or more authors that are already entered in the author table. I don’t want them entered again, so I use the dbAuthors function (Lines 50–56) to query the database for all the authors and put them in a dictionary. The dictionary may seem backwards—its keys are the names and the values are the IDs—but that makes it easy to look up an author’s ID by their name.

The addAuthor function (Lines 58–67) does what you’d expect: it executes an SQL command to add a new author to the author table. The return value is the author’s ID.
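Here is a small sketch of these two helpers working against a throwaway in-memory database. One thing in it is an alternative, not what the script does: it reads the new ID from the cursor's `lastrowid` attribute instead of running a follow-up select. The table definition is also an assumption.

```python
import sqlite3

con = sqlite3.connect(':memory:')
cur = con.cursor()
cur.execute('create table author (id integer primary key, name text)')

def add_author(cur, name):
    """Insert an author and return the new row's ID."""
    cur.execute('insert into author(name) values(?)', [name])
    return cur.lastrowid     # rowid of the row just inserted

def db_authors(cur):
    """Return {name: id} for every author in the table."""
    # fetchall() gives (name, id) tuples, which dict() turns into pairs
    return dict(cur.execute('select name, id from author').fetchall())

first = add_author(cur, 'Timoshenko, Stephen')
second = add_author(cur, 'Gere, James M.')
existing = db_authors(cur)

print(existing['Gere, James M.'])          # look up an ID by name
print('King, Wilton W.' in existing)       # False: not yet in the table
```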

The bookData function (Lines 69–136) is by far the longest function. It starts at the root of the XML element tree and pulls out all of the elements needed for the book table entry. It returns a dictionary in which the keys are the book field names (other than id, which will be automatically generated), and the values are the corresponding entries in the MODS file. If there is no entry for, say, a subtitle or volume number, the dictionary is given a None value for that key.

I’m using the ElementTree module for parsing and searching the MODS, and its find and findall functions want the namespace of the elements they’re looking for whenever those elements belong to a namespace. As you can see in the first line of the example above, three namespaces are declared in the MODS record. The first is the default, which applies to all the elements I’m after:

http://www.loc.gov/mods/v3

That’s the reason for the ns dictionary defined on Line 83 and the m: prefix in all the element names.
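The effect of the namespace is easy to see with a trimmed-down record. This sketch uses a few elements copied from the MODS example above and compares a bare search, a prefixed search, and the fully qualified form that the prefix stands for.

```python
import xml.etree.ElementTree as et

# A cut-down version of the MODS record shown earlier
MODS = '''<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort xml:space="preserve">An </nonSort>
    <title>introduction to dynamics</title>
  </titleInfo>
  <identifier type="lccn">83025283</identifier>
</mods>'''

root = et.fromstring(MODS)
ns = {'m': 'http://www.loc.gov/mods/v3'}

# Without the namespace, nothing matches
bare = root.find('titleInfo/title')

# With the prefix and the ns dictionary, the element is found
title = root.find('m:titleInfo/m:title', ns).text

# The prefix is shorthand for the {uri}tag form ElementTree uses internally
uri = '{http://www.loc.gov/mods/v3}'
same = root.find(f'{uri}titleInfo/{uri}title').text

# Attribute predicates work alongside the prefix
lccn = root.find('m:identifier[@type="lccn"]', ns).text

print(bare, title, lccn)
```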

Searching for the book fields takes up many lines of code, partly because MODS has nested data and partly because some of the fields may not be present. That’s why there are several try/except blocks.
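As a possible alternative to the try/except blocks, ElementTree's `findtext` method returns a default value when an element is missing. This is a sketch of that approach, not what the script does; the helper name `field` is made up.

```python
import xml.etree.ElementTree as et

ns = {'m': 'http://www.loc.gov/mods/v3'}

# Trimmed record: it has a date but no edition element
root = et.fromstring('''<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>introduction to dynamics</title></titleInfo>
  <originInfo><dateIssued encoding="marc">1984</dateIssued></originInfo>
</mods>''')

def field(root, path):
    """Return the text of the first match, or None if there is no match."""
    return root.findtext(path, default=None, namespaces=ns)

published = field(root, 'm:originInfo/m:dateIssued')
edition = field(root, 'm:originInfo/m:edition')

print(published, edition)
```

One difference to keep in mind: for an element that exists but is empty, `findtext` returns an empty string, not None.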

One last thing: the added field (Line 134) comes from the date on which lccn2library is run. It has nothing to do with the MODS data.

The last function is authorData (Lines 138–175). It pulls the names of the authors from the MODS data, distinguishing between the primary author and the others. It then uses dbAuthors (Line 157) to figure out which of this book’s authors are already in the database. Those that aren’t in the database are added to it using the addAuthor function described above. A dictionary of all the book’s authors is returned. As with dbAuthors, the keys are the author names and the values are the author IDs. The primary author comes first in the dictionary.2
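One detail worth noting: the set difference on Line 152 puts the primary author first, but Python sets are unordered, so a book with several secondary authors may get them back in any order. If that order matters, a list-based split like the sketch below keeps them in document order. The three-author record and its names are invented for illustration.

```python
import xml.etree.ElementTree as et

ns = {'m': 'http://www.loc.gov/mods/v3'}

# Hypothetical record with the primary author listed second
root = et.fromstring('''<mods xmlns="http://www.loc.gov/mods/v3">
  <name type="personal"><namePart>Second, Author B.</namePart></name>
  <name type="personal" usage="primary"><namePart>First, Author A.</namePart></name>
  <name type="personal"><namePart>Third, Author C.</namePart></name>
</mods>''')

names = root.findall('m:name', ns)

# Split on the usage attribute; list comprehensions preserve document order
primary = [n for n in names if n.get('usage') == 'primary']
secondary = [n for n in names if n.get('usage') != 'primary']

ordered = [n.find('m:namePart', ns).text for n in primary + secondary]
print(ordered)
```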

The main program starts on Line 181 by connecting to the database and setting up a “cursor” for executing commands. The LCCN argument is put in canonical form (Line 185), which is then used with the requests library to download the MODS data.

Before parsing the XML, I normalize the Unicode data into the NFC format on Line 189. This means that something like é is made into a single character, rather than an e followed by a combining acute accent character. I did this because some utilities I use to format the results of database queries don’t like combining characters.
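A two-line check shows what NFC normalization does to the é example:

```python
from unicodedata import normalize

decomposed = 'e\u0301'                 # e + COMBINING ACUTE ACCENT
composed = normalize('NFC', decomposed)

print(len(decomposed), len(composed))  # 2 1
print(composed == '\u00e9')            # True: the single precomposed character
```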

With the root of the MODS element tree defined in Line 190, the script then calls bookData to get the dictionary of book info. That is then inserted into the book table on Lines 194–196 using SQL command execution with parameters. The ID of the newly added book—which was automatically generated by the insert command—is then gathered from the database and put in the bookID variable in Lines 197–200.
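The insert on Lines 194–196 relies on the dictionary's values being in the same order as the column list. A variation that doesn't depend on order is to use sqlite3's named placeholders and pass the dictionary itself. This is a sketch of that alternative, with an assumed table definition and `FIELDS` as a made-up helper name. The book values come from the MODS record above, except `added`, which is an arbitrary date.

```python
import sqlite3

FIELDS = ['title', 'subtitle', 'volume', 'edition', 'publisher',
          'published', 'lccn', 'loc', 'added']

con = sqlite3.connect(':memory:')
cur = con.cursor()
cur.execute(f"create table book (id integer primary key, {', '.join(FIELDS)})")

# Deliberately not in column order; None becomes NULL in the database
book = {
    'loc': 'TA352 .M385 1984', 'lccn': '83025283',
    'title': 'An introduction to dynamics', 'subtitle': None,
    'volume': None, 'edition': None,
    'publisher': 'Brooks/Cole Engineering Division',
    'published': '1984', 'added': '2025-04-20',
}

# Builds "insert into book(title, ...) values(:title, ...)"
insert_cmd = (f"insert into book({', '.join(FIELDS)}) "
              f"values({', '.join(':' + f for f in FIELDS)})")
cur.execute(insert_cmd, book)
book_id = cur.lastrowid

row = cur.execute('select title, subtitle, loc from book where id = ?',
                  [book_id]).fetchone()
print(row)
```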

The authors are added to the author table by calling authorData on Line 203. The dictionary of author names and IDs is saved in the authors variable.

The entries in the book_author table are inserted in Lines 206–209 using the bookID and authors values.

With all the table insertions done, the changes are committed and the database connection closed in Lines 212–213. The loc field for the book is printed out by Line 216.

Phew!

As you might imagine, I didn’t run lccn2library by hand hundreds of times when I was initially populating the database. No, I made files with lists of LCCNs, one per line, and ran commands like

xargs -n 1 lccn2library < lccn-list.txt

Giving the -n 1 option to xargs ensures that lccn2library will consume only one argument each time it’s run.

I typically did this in chunks of 20-50 LCCNs, mainly because every time I thought I finally had lccn2library debugged, some new MODS input would reveal an error. I feel certain there are still bugs in it, but none have turned up in a while.

There are currently about 500 books in the database, and I think I’ve cleaned out every nook and cranny in the house where books might be hiding. Of course, books still somehow show up when I visit used book stores or (this is the most dangerous) AbeBooks.

Foppl books at AbeBooks


  1. Shout out to librarians for keeping the word “patron” alive and kicking. 

  2. Since Python 3.7, dictionaries are guaranteed to maintain the order in which they were built. Before that, you had to use the OrderedDict class. The value of putting the primary author first is that it ensures that scripts like bytitle and byauthor (discussed in my earlier posts) will return the list of authors with the primary author first. That’s how I think of the books. It’s McGill & King, not King & McGill. 


Human nature

On my walk this morning, I was greeted by this scene at the entrance to the Springbrook Prairie Preserve.

Springbrook Prairie entrance with bags of dogshit

By the rocks in the lower right corner and elsewhere on that side of the path were about a dozen colorful bags of dogshit.

I’ve downsized the image so you can’t zoom in to read the little white sign stuck in the ground near the center of the frame. Here it is:

Sign telling people not to leave bags of dogshit in the prairie

It says

Please! Keep our beautiful preserve clean. Haul your dog waste bags out of the park with you. Thank you.

Bags of shit left behind by dog walkers are not uncommon in Springbrook, but I’ve never seen so many in one spot. Maybe this is my general misanthropy talking, but I couldn’t help but think that the bags were deliberately carried and dropped there as a “fuck you” to the sign writer. “You can’t tell me what to do.”

No question, the sign writer was being passive aggressive and probably should have expected that reaction. I would have, but that’s probably my misanthropy coming out again.

I just got back from my second walk to Springbrook today. I took a large bag, gathered up all the small bags, and put them in my garbage bin at home. The bins go out tonight, so it’s not like I’m hoarding dogshit.

I didn’t do this because I’m a good person. My wife was the good person, and she would have been appalled. Not by the dogshit but by the plastic bags, which would tear apart (some already had) and spread tiny bits of plastic over the prairie to be eaten by the birds and coyotes who live there. She would’ve gone back to pick up the bags, so I had to.


Priced in

One economic constant you can hang onto in volatile times is that experts will appear on your TV and computer screens to tell you that The Market has already priced in whatever big change we’re going through. This is because the Masters of the Universe have such finely tuned senses that they can predict with great accuracy what the rest of us haven’t the slightest notion of.

It doesn’t matter how demonstrably false this is; they will say it anyway, as if there’s some trigger in their heads that makes this come out of their mouths and keyboards whenever they’re asked to explain the goings-on of Wall Street. This even happened on Thursday evening, after stocks had fallen off a cliff. “Oh no,” I heard several say, “The Market had already mostly accounted for Trump’s tariffs. It just needed to adjust to their size.”

Here are the closing values of the S&P 500 index from the beginning of October through last Friday:

S&P 500 Index

What do you think? Trump ran on a policy of raising tariffs. It was one of the few things he was consistent on. And it was well known that he wasn’t going to bring any traditional Republicans into his administration who might slow-walk his proposals. So did The Market lose value in anticipation of the Trump tariffs back in October, when it looked like he had a good chance of winning? What about after the November election? After his inauguration?

There was definitely a drop for about three weeks in late February and early March. This was three weeks after he announced the tariffs on Canada, China, and Mexico—does that count as a prediction? The drop did start about a week before the end of the 30-day pause on those tariffs, so maybe that’s considered a prediction on Wall Street. But some of that drop was erased by a rise over the next couple of weeks. What signs of optimism were there during that period?

The S&P 500 even went up last Wednesday, when it was known that he was going to make an announcement after trading closed. By that time everyone knew he was going to say something bad—that’s the reason he waited until after closing. But still there was no anticipatory drop.

I confess I didn’t read/watch/listen to any analysts on Friday night. Maybe they all said they’d been wrong about The Market and would never make that mistake again. And maybe my 401k went up.


A slashing odyssey

In the Mastodon announcement of my last post, I included some text that was struck through:

☃️ A shortcut I use to c̸h̸e̸a̸t̸ a̸t̸ play the NY Times Connections game. https://leancrew.com/all-this/2025/04/play-connections/

This was not done with <s> tags. Those use a horizontal line to strike through, not diagonals, and besides, Mastodon doesn’t support HTML tags.

Instead, each struck-through character was followed by the Long Solidus Overlay combining character. Combining characters overstrike the previous character, which is just what I wanted.

I used to have a Service that created struck-through text like this, but that was long ago and was written in Python 2, which I don’t use anymore. For the particular seven characters shown above, I just copied the combining solidus character from its FileFormat page and pasted it after each character I wanted struck through. That was the quickest way to get it done for a one-off.

But I do like making jokey strikethroughs like that, so a new Quick Action1 for doing it seemed in order. Because the world has changed since that post in 2014, it seemed best to use Shortcuts to invoke a Python 3 script and save it as a Quick Action.

Here’s Shortcuts with the Quick Action:

Slash Text Shortcut

The Python code in the Run Shell Script step is this:

python:
from sys import stdin
from string import whitespace

orig = stdin.read().rstrip('\n')
print(''.join(c if c in whitespace else c + '\u0338' for c in orig), end='')

I found that Shortcuts was adding a linefeed to the end of the selected text, so that’s why the rstrip('\n') method was added to read. The generator expression inside the join function uses Python’s somewhat tricky conditional expression, which puts an if-then-else on a single line but in then-if-else order. The generator expression adds the solidus combining character (\u0338) after each character that’s not whitespace.
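For anyone who finds the one-liner hard to read, here is the same logic wrapped in a function, next to an equivalent version with an ordinary loop. The function names and the `SOLIDUS` constant are just for illustration.

```python
from string import whitespace

SOLIDUS = '\u0338'   # COMBINING LONG SOLIDUS OVERLAY

def slash(text):
    """Conditional expression: <then-value> if <test> else <else-value>."""
    return ''.join(c if c in whitespace else c + SOLIDUS for c in text)

def slash_loop(text):
    """The same thing spelled out with an if/else statement."""
    out = []
    for c in text:
        if c in whitespace:
            out.append(c)            # leave spaces, tabs, newlines alone
        else:
            out.append(c + SOLIDUS)  # overstrike everything else
    return ''.join(out)

struck = slash('cheat at')
print(struck)
print(struck == slash_loop('cheat at'))   # True
```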

Using this Quick Action is simple: select the text you want to strike through, control-click and choose Slash Text from the Services submenu. Like this:

Choosing Slash Text from the Services submenu

When I tested the Quick Action in some common text editing applications, like BBEdit, Mail, Pages, TextEdit, and even Stickies, it worked just fine. But you know I wouldn’t write a sentence like that if there weren’t another shoe waiting to drop. And you’re right. It didn’t work in Mona or Messages, the two apps in which I’d use it most frequently.

Error alert from Messages

Why did it fail in those apps? I have no idea, but whatever the reason, it was no good to me in its current state. What to do?

My first thought was to revert to using Automator to make the Quick Action. It’s supposed to be getting phased out in favor of Shortcuts, but it still works. Here’s my Slash Text Automator workflow:

Slash Text Automator workflow

The shell script is now run via bash:

source $HOME/.bashrc

slashtext

I couldn’t just use the Python code in Automator because it doesn’t allow me to use either /usr/bin/python3 or my Homebrew-installed Python as the shell. The shells it allows are

  • /bin/bash
  • /bin/csh
  • /bin/ksh
  • /bin/sh
  • /bin/tcsh
  • /bin/zsh
  • /usr/bin/perl
  • /usr/bin/ruby
  • /usr/local/bin/python
  • /usr/local/bin/python3

These are apparently unchangeable because they’re in a plist file on a read-only partition. Since I don’t have a Python executable in /usr/local/bin,2 my workaround was to save my Python script as a file, slashtext, in my $PATH and call it via bash. The source code of slashtext is the same Python code as before but with a shebang line:

python:
#!/usr/bin/env python3

from sys import stdin
from string import whitespace

orig = stdin.read().rstrip('\n')
print(''.join(c if c in whitespace else c + '\u0338' for c in orig), end='')

Going back to the shell script, my .bashrc file sets my $PATH when I work in the Terminal, so sourcing it gives the Quick Action my usual environment. That’s why I can call slashtext directly in the following line.

This Quick Action worked in every application I tried, an indication that Shortcuts still has a ways to go before it’s as reliable as Automator.

Speaking of reliability, my final thought was that I should give up on Quick Actions entirely and just make a Keyboard Maestro macro (download). Peter Lewis’s app is more trustworthy than most of what Apple’s put out recently. With the Python script already written, the only real work was adding some actions on either side of the script to deal with the clipboard. Here’s a screenshot of the macro:

KM Slash Text

Both the rstrip('\n') and the end='' are unnecessary here because no linefeed is added to the input and Keyboard Maestro strips trailing linefeeds from the output by default. But leaving those bits of code in doesn’t hurt anything.

Typing ⌥⇧⌘/ is certainly faster than control-clicking and navigating the Services submenu. Will I remember this keystroke combination? Probably not, but since I have KeyCue installed, I don’t need to. All I have to do is press the Control key, wait a couple of seconds for the KeyCue window to appear and choose Slash Text.

KeyCue window

That’s still easier than using the Services submenu.

By the way, Keyboard Maestro allows you to export macros as Text Services. For this one you’d first have to change the input of the shell script to the %TriggerValue% token and delete a couple of the clipboard actions, but otherwise the macro would be the same. I didn’t do it that way because I’d already convinced myself to avoid Quick Actions/Services.

If you really want to stick with Apple-supplied software while still having a fast way to slash text, you can build the Automator workflow, save it as a Quick Action, and then set a keyboard shortcut to it via the System Settings app. I’d use that solution if Peter Lewis or the Ergonis people decided to retire and no one took up their apps.

Update 6 Apr 2025 6:25 PM
John Gruber shared a Keyboard Maestro macro he uses to skip over the complicated mousework needed to access the Services menu. The macro, run via a keystroke combination, opens the submenu but doesn’t select any command. He then starts typing, which selects the Quick Action he wants, and hits Return to execute it. This takes longer to describe than it takes to do.

I like the hybrid nature of this approach. By tucking Quick Actions away in the Services menu, Apple has made them hard to get at. This macro brings them out into the light and requires the memorization of only one keystroke combination. It parallels how I use KeyCue to remind me of Keyboard Maestro macros whose keyboard shortcuts I’ve forgotten.


  1. Why did Apple decide to change their name from Services to Quick Actions even though they’re still invoked from the Services menu? Maybe it was to distinguish the action from the menu, but if that were the case, why do we now build shortcuts in the Shortcuts app?