Hello!

I am pleased to announce a new version of my CLI text processing with GNU awk ebook. This book dives deep into field processing and covers filtering features, multiple file processing, constructing solutions that depend on multiple records, comparing records and fields between two or more files, identifying duplicates while maintaining input order, and so on. Regular expressions are also discussed in detail.

Book links

To celebrate the new release, you can download the PDF/EPUB versions for free till 06-April-2025.

Or, you can read it online at https://learnbyexample.github.io/learn_gnuawk/


Feedback

I would highly appreciate it if you’d let me know how you felt about this book. It could be anything from a simple thank you to pointing out typos or mistakes in code snippets, or which aspects of the book worked for you (or didn’t!).

Happy learning :)

  • learnbyexample@programming.dev (OP) · 4 days ago

    Well, if you are comfortable with Python scripts, there’s not much reason to switch to awk. Unless perhaps you are equating awk to Python as scripting languages instead of CLI usage (like grep, sed, cut, etc) as my ebook focuses on. For example, if you have space separated columns of data, awk '{print $2}' will give you just the second column (no need to write a script when a simple one-liner will do). This of course also allows you to integrate with shell features (like globs).
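    As a quick sketch of that one-liner (sample input and field values are mine, just for illustration):

    ```shell
    # Print the second whitespace-separated field of each line.
    printf 'alice 42 admin\nbob 17 guest\n' | awk '{print $2}'
    # prints:
    # 42
    # 17

    # And it composes with shell features, e.g. globs:
    #   awk '{print $2}' *.log
    ```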

    As a practical example, I use awk to filter and process particular entries from financial data (which is in csv format). It’s just a case of easily arriving at a solution in a single line of code (which I then save for future use).
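    A minimal sketch of that kind of filtering, assuming a hypothetical `date,category,amount` column layout (the column names and data are my invention, not from the original):

    ```shell
    # Sum the amounts for one category in a simple comma-separated file.
    # -F, sets the field separator; $2 is category, $3 is amount.
    printf '2025-01-03,groceries,45.50\n2025-01-05,rent,900\n2025-01-09,groceries,12.25\n' |
      awk -F, '$2 == "groceries" {sum += $3} END {print sum}'
    # prints: 57.75
    ```

    Note this naive `-F,` approach assumes no quoted fields containing commas; GNU awk also offers `--csv` (gawk 5.3+) for proper CSV parsing.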

    • Baldur Nil@programming.dev · edited · 4 days ago

      Also, AWK is made to be fast, right? I suppose doing something inefficiently in CPython might not be noticeable with a bit of text, but it would show up with a large enough data stream.