Performs stemming of tokenized text using the Porter algorithm (M.F. Porter: An algorithm for suffix stripping. In: Program, 14(3), pp. 130-137, July 1980).

"The Porter stemming algorithm (or ‘Porter stemmer’) is a process for removing the commoner morphological and inflectional endings from words in English."
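To give a feel for what "removing endings" means in practice, here is a minimal Python sketch of only the first rule group of the Porter algorithm (step 1a, which handles plural endings). It is not the implementation behind STEM_WORDS, and the function name porter_step_1a is made up for this illustration. It assumes lowercase suffixes and leaves the rest of the word untouched.

```python
def porter_step_1a(word: str) -> str:
    """Apply only step 1a of the Porter algorithm (plural endings).

    Illustrative sketch: the full algorithm has several more steps
    that this function does not implement.
    """
    if word.endswith("sses"):
        return word[:-2]  # SSES -> SS, e.g. caresses -> caress
    if word.endswith("ies"):
        return word[:-2]  # IES -> I, e.g. ponies -> poni
    if word.endswith("ss"):
        return word       # SS -> SS, e.g. caress stays caress
    if word.endswith("s"):
        return word[:-1]  # S -> (nothing), e.g. cats -> cat
    return word


# Tokens taken from the example input on this page.
tokens = ["shoes", "steps", "priests", "adversaries", "practices"]
stems = [porter_step_1a(t) for t in tokens]
print(stems)  # ['shoe', 'step', 'priest', 'adversari', 'practice']
```

Note that "practices" comes out as "practice" here, while the full algorithm reduces it further to "practic", because later steps strip additional endings.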



Tokenized text column:
[Unhappily, master, well, himself, curiosity, drew, unconsciously, farther, intended, go, last, having, seen, Parsee, carnival, wind, away, distance, turning, steps, towards, station, happened, espy, splendid, pagoda, Malabar, Hill, seized, irresistible, desire, see, interior, quite, ignorant, forbidden, Christians, enter, certain, Indian, temples, even, faithful, go, without, first, leaving, shoes, outside, door, here, wise, policy, British, Government, severely, punishes, disregard, practices, native, religions]
STEM_WORDS returns:
[Unhappili, master, well, himself, curios, drew, unconsci, farther, intend, go, last, have, seen, Parse, carniv, wind, awai, distanc, turn, step, toward, station, happen, espi, splendid, pagoda, Malabar, Hill, seiz, irresist, desir, see, interior, quit, ignor, forbidden, Christian, enter, certain, Indian, templ, even, faith, go, without, first, leav, shoe, outsid, door, here, wise, polici, British, Govern, sever, punish, disregard, practic, nativ, religion]

Tokenized text column:
[Passepartout, thinking, harm, went, simple, tourist, soon, lost, admiration, splendid, Brahmin, ornamentation, everywhere, met, eyes, sudden, found, himself, sprawling, sacred, flagging, looked, up, behold, three, enraged, priests, forthwith, fell, upon, tore, shoes, began, beat, loud, savage, exclamations, agile, Frenchman, soon, upon, feet, again, lost, time, knocking, down, two, long-gowned, adversaries, fists, vigorous, application, toes, rushing, out, pagoda, fast, legs, carry, soon, escaped, third, priest, mingling, crowd, streets]
STEM_WORDS returns:
[Passepartout, think, harm, went, simpl, tourist, soon, lost, admir, splendid, Brahmin, ornament, everywher, met, ey, sudden, found, himself, sprawl, sacr, flag, look, up, behold, three, enrag, priest, forthwith, fell, upon, tore, shoe, began, beat, loud, savag, exclam, agil, Frenchman, soon, upon, feet, again, lost, time, knock, down, two, long-gown, adversari, fist, vigor, applic, toe, rush, out, pagoda, fast, leg, carri, soon, escap, third, priest, mingl, crowd, street]
  • This function is not guaranteed to be 100% accurate.
  • The accuracy of results depends on both the training data used by the function and the quality of the input data relative to that training data.
  • To assess result accuracy, you need to conduct your own performance evaluation.
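One simple way to run such an evaluation is to compare the stems returned for a sample of your own tokens against reference forms that you consider correct. The Python sketch below is only an illustration under that assumption. The helper name agreement_rate and the reference list are hypothetical and not part of the function described here.

```python
def agreement_rate(actual, expected):
    """Return the fraction of positions where two equal-length lists agree."""
    if len(actual) != len(expected):
        raise ValueError("lists must have the same length")
    if not actual:
        return 0.0
    matches = sum(a == e for a, e in zip(actual, expected))
    return matches / len(actual)


# Stems as shown in the example output on this page.
stemmed = ["think", "harm", "went", "simpl"]
# Hypothetical reference forms chosen by a reviewer (assumption, not real data).
reference = ["think", "harm", "go", "simple"]

rate = agreement_rate(stemmed, reference)
print(f"{rate:.2f}")  # 0.50
```

Whether a mismatch such as "simpl" versus "simple" counts as an error depends on what the stems are used for, so the reference forms should reflect your own use case.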