Linux Fu: X Command

Text-based Linux and Unix systems are easy to manipulate. Because of the way the Unix I/O system works, you can nearly always fake keyboard input to another program and intercept its output. The whole system is made to work that way. Graphical X11 programs are another matter, though. Is there a way to control X11 programs like you control text programs? The answer to that question depends on exactly what you want to do, but the general answer is yes.

As usual for Linux and Unix, though, there are many ways to get to that answer. If you really want fine-grained control over programs, some programs offer control via a special mechanism known as D-Bus. This allows programs to expose data and methods that other programs can use. In a perfect world your target program will use D-Bus, but that is not always the case. So today we’ll look instead at controlling arbitrary programs.
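To get a feel for what D-Bus exposes, here is a minimal sketch that lists the services registered on the session bus. It assumes the dbus-send utility is installed and a session bus is running; bus_names is a hypothetical helper name that only pulls the quoted strings out of the reply.

```shell
#!/bin/bash
# bus_names: hypothetical helper. Reads the output of
# "dbus-send --print-reply ... ListNames" on standard input, extracts the
# quoted strings, and drops the unique connection names (those that start
# with a colon), leaving the well-known service names.
bus_names() {
    sed -n 's/^ *string "\(.*\)"$/\1/p' | grep -v '^:'
}

# Usage, on a machine with a running session bus:
#   dbus-send --session --print-reply --dest=org.freedesktop.DBus \
#       /org/freedesktop/DBus org.freedesktop.DBus.ListNames | bus_names
```

If the program you care about shows up in that list, it may be worth exploring its D-Bus interface before reaching for the X-level tools.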

There are several programs that can control X windows in some way or another. There’s a tool called xdo that you don’t hear much about. More common is xdotool, and I’ll show you an example of that. Also, wmctrl can perform some similar functions. There’s also AutoKey, which covers some of the same ground as the popular Windows program AutoHotkey.

About xdotool

The xdotool program is probably the most useful of the commands when you need to take over GUI programs. It is sort of a Swiss Army knife of X manipulation. However, the command line syntax is a bit difficult, and that’s likely because the tool can do lots of different things. Most of the time I am interested in its ability to move and resize windows. But it can also send fake keyboard and mouse input, and it can bind actions to things like mouse motion and window events.

Although you can make the tool read from a file, you most often see the arguments right on the command line. The idea is to find a window and then apply things to it. You can find windows by name or use other means such as letting the user click on the desired window.

For example, consider this:

echo Pick Window; xdotool selectwindow type "Hackaday"

If you enter this at a shell prompt, you can click on a window and see the given string appear as if it were typed there by the user. The tool is also capable of sending mouse events and performing a multitude of window operations like changing window focus, changing which desktop is shown, etc.
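Finding a window by name instead of clicking on it follows the same pattern. The sketch below is only an illustration: type_into and run are hypothetical helper names, and the title pattern is whatever matches your target window. Setting DRY_RUN=1 prints the xdotool command instead of running it, which is handy while you experiment.

```shell
#!/bin/bash
# run: hypothetical helper. Prints the command when DRY_RUN=1,
# otherwise executes it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# type_into PATTERN TEXT
# Finds the first window whose title matches PATTERN, activates it,
# and types TEXT with a small delay between keystrokes.
type_into() {
    run xdotool search --limit 1 --name "$1" \
        windowactivate --sync type --delay 50 "$2"
}

# Usage:
#   type_into "Kate" "Hackaday"
#   DRY_RUN=1 type_into "Kate" "Hackaday"   # just show the command
```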

By the way, some of xdotool’s features require the XTest extension to your X server. I’ve always found this turned on, but if things aren’t working, you’d want to check your X server log to see if that extension is loaded.
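As an alternative to digging through the server log, you can ask the running server which extensions it has. This is a small sketch that assumes the xdpyinfo utility is installed; has_xtest is a hypothetical helper name that just looks for XTEST in the extension list.

```shell
#!/bin/bash
# has_xtest: hypothetical helper. Succeeds if the text on standard input
# (the output of xdpyinfo) lists the XTEST extension.
has_xtest() {
    grep -qw 'XTEST'
}

# Usage:
#   if xdpyinfo | has_xtest; then
#       echo "XTest is available"
#   else
#       echo "XTest is missing; some xdotool features may not work"
#   fi
```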

What About wmctrl?

The wmctrl program has a lot of similar functions but mostly interacts with your window manager. The only problem is, it uses a standard interface to your window manager and not all window managers support all features. This is one of those things that makes distributing programs for Linux so exciting. No two systems are alike and some aren’t even close!

The wmctrl program shines when you want to do things like switch desktops, maximize windows, and related tasks. However, it can do many of the tasks that xdotool can do, as well.
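For instance, cycling to the next desktop takes only a little parsing of the desktop list. This is a sketch that assumes your window manager reports desktops through wmctrl -d in the usual layout, with an asterisk in the second column marking the current one; next_desktop is a hypothetical helper name.

```shell
#!/bin/bash
# next_desktop: hypothetical helper. Reads "wmctrl -d" output on standard
# input and prints the index of the desktop after the current one,
# wrapping around to 0 after the last desktop.
next_desktop() {
    awk '$2 == "*" { cur = $1 }
         END { if (NR > 0) { n = (cur + 1) % NR; print n } }'
}

# Usage:
#   wmctrl -s "$(wmctrl -d | next_desktop)"
#
# Maximizing the active window works along the same lines:
#   wmctrl -r :ACTIVE: -b add,maximized_vert,maximized_horz
```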

Using a Big Monitor

I recently switched out my three-monitor setup for a very large 4K monitor. The 43-inch behemoth has a resolution of 3840×2160. That’s great, but I did miss being able to put one program on one monitor and a second (or third) program on another monitor.

The answer was to get the windows to slide into certain positions on the screen. You could use a tiled window manager, but I use KDE (which no longer has a tiling option). It will snap windows to certain positions if you drag them to just the right spot, but that’s not very fast. Also, the snap areas were not all where I wanted them.

My original thought was just to use xdotool and map some keys using KDE’s shortcuts. Control+Alt+1 could snap the current window to the top left of the screen and Control+Alt+0 could maximize. Control+Alt+6 would eat up the right half of the screen and Control+Alt+8 would take up the top half.

My first attempt at creating the Control+Alt+1 shortcut looked like this:

xdotool getwindowfocus windowmove 0 0 windowsize 1920 1080

The idea is to find the current window, move it to 0,0, and then make it take up a quarter of the screen. Sure, the hardcoded numbers aren’t great, but they work for a single-machine setup. You can set the size to 50% 50% if you prefer. That makes sense for this one, but for the other macros, where the position isn’t 0,0, you have to use a hardcoded number anyway.
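If the hardcoded positions bother you, a little shell arithmetic can turn a percentage into pixels. The sketch below assumes your xdotool has the getdisplaygeometry command, which prints the screen width and height; pct_of is a hypothetical helper name.

```shell
#!/bin/bash
# pct_of PERCENT TOTAL
# Hypothetical helper: prints PERCENT percent of TOTAL as a whole number
# of pixels (integer division, so fractions are dropped).
pct_of() {
    echo $(( $2 * $1 / 100 ))
}

# Usage: put the focused window on the right half of the screen.
#   set -- $(xdotool getdisplaygeometry)    # $1 = width, $2 = height
#   xdotool getwindowfocus windowsize 50% 100% \
#       windowmove "$(pct_of 50 "$1")" 0
```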

The first problem was that, in some cases, the move did not take effect every time. Reversing the order, so that the resize happens before the move, took care of that.

However, there was still a problem. If you maximize a window, the window size and position values don’t do anything. Here’s where wmctrl can help:

wmctrl -r :ACTIVE: -b remove,maximized_horz,maximized_vert ; xdotool getwindowfocus windowsize 1920 1080 windowmove 0 0

This removes the maximized property from the active window and then applies the xdotool command. However, if you are going to use wmctrl, you might as well let it do the move and resize, too. Since wmctrl acts on one action per invocation, run it twice:

wmctrl -r :ACTIVE: -b remove,maximized_horz,maximized_vert ; wmctrl -r :ACTIVE: -e 0,0,0,1920,1080

The -e option moves the window. The first zero isn’t a typo. It sets the “gravity” of the window and is usually zero. The next four numbers are the corner coordinate and the size. However, you will notice that while xdotool moves the top left corner of the window, wmctrl moves the top left corner of the interior of the window (that is, not including the window decorations). So the result is slightly different.
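If you want the wmctrl result to line up with the xdotool one, you can try to compensate for the decorations. This is only a sketch: it assumes your window manager sets the _NET_FRAME_EXTENTS property (left, right, top, bottom border widths) on the window, and exactly how positions are interpreted still varies between window managers. The name client_geometry is a hypothetical helper.

```shell
#!/bin/bash
# client_geometry X Y W H
# Hypothetical helper. Reads the output of
#   xprop -id WINDOW _NET_FRAME_EXTENTS
# on standard input and converts an outer (frame) geometry into the
# interior geometry "x,y,w,h" suitable for the tail of wmctrl -e.
# Prints nothing if the property is not set.
client_geometry() {
    awk -v x="$1" -v y="$2" -v w="$3" -v h="$4" -F'[=,] *' '
        /_NET_FRAME_EXTENTS\(CARDINAL\)/ {
            # $2=left  $3=right  $4=top  $5=bottom
            printf "%d,%d,%d,%d\n", x + $2, y + $4, w - $2 - $3, h - $4 - $5
        }'
}

# Usage:
#   id=$(xdotool getactivewindow)
#   geom=$(xprop -id "$id" _NET_FRAME_EXTENTS | client_geometry 0 0 1920 1080)
#   [ -n "$geom" ] && wmctrl -r :ACTIVE: -e "0,$geom"
```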

Of course, you could write a simple bash script to manage all this and work out the math so if your screen size changes you don’t have to change each macro. You could even adjust to make the windows not overlap or do other special effects. For example:

#!/bin/bash

# Change to suit or read them from xrandr
#SCREENX=3840
#SCREENY=2160

# If you don't have xrandr or awk, or yours doesn't put out the right
# format, just hardcode the values up top
SCREENX=$(xrandr -q | awk -F'[ ,]+' '/current/ { print $8 }')
SCREENY=$(xrandr -q | awk -F'[ ,]+' '/current/ { print $10 }')

# Could adjust the actual locations here if you wanted

HALFX=$(( SCREENX/2 ))
HALFY=$(( SCREENY/2 ))

if [ $# -ne 1 ]
then
    ARG="?"
else
    ARG="$1"
fi

case "$ARG" in
    nw)
        TOP=0
        LEFT=0
        W=$HALFX
        H=$HALFY
        ;;
    n)
        TOP=0
        LEFT=0
        W=$SCREENX
        H=$HALFY
        ;;
    ne)
        TOP=0
        LEFT=$HALFX
        W=$HALFX
        H=$HALFY
        ;;
    w)
        TOP=0
        LEFT=0
        W=$HALFX
        H=$SCREENY
        ;;
    center)
        TOP=$(( SCREENY/4 ))
        LEFT=$(( SCREENX/4 ))
        W=$HALFX
        H=$HALFY
        ;;
    e)
        TOP=0
        LEFT=$HALFX
        W=$HALFX
        H=$SCREENY
        ;;
    sw)
        TOP=$HALFY
        LEFT=0
        W=$HALFX
        H=$HALFY
        ;;
    s)
        TOP=$HALFY
        LEFT=0
        W=$SCREENX
        H=$HALFY
        ;;
    se)
        TOP=$HALFY
        LEFT=$HALFX
        W=$HALFX
        H=$HALFY
        ;;
    *)
        echo "Usage: winpos (nw, n, ne, w, center, e, sw, s, se)"
        exit 1
        ;;
esac

# do it
# wmctrl -r :ACTIVE: -b remove,maximized_horz,maximized_vert
# wmctrl -r :ACTIVE: -e 0,$LEFT,$TOP,$W,$H
# or here's another way (note: this will show title bars without adjustment;
# the above method will cut them off at the top part of the screen)
wmctrl -r :ACTIVE: -b remove,maximized_horz,maximized_vert
xdotool getwindowfocus windowsize "$W" "$H" windowmove "$LEFT" "$TOP"

exit 0

With this script, your keyboard macros can just call the script with a tag like “ne” (Northeast) or “center” to control the window position. Any changes are easy to manage in the script instead of spread over multiple macros.
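If you want layouts beyond halves and quarters, the same arithmetic generalizes to any grid. The following is a sketch, not a drop-in replacement: screen_size and cell are hypothetical helper names, and the optional gap is one way to keep neighboring windows from touching.

```shell
#!/bin/bash
# screen_size: hypothetical helper. Reads "xrandr -q" output on standard
# input and prints "WIDTH HEIGHT" taken from the "current W x H" field.
screen_size() {
    awk -F'[ ,]+' '/current/ { print $8, $10; exit }'
}

# cell COLS ROWS COL ROW SCREENX SCREENY [GAP]
# Hypothetical helper. Prints "LEFT TOP W H" for the zero-based cell
# (COL, ROW) of a COLS x ROWS grid, shrunk by GAP pixels on every side.
cell() {
    local cols=$1 rows=$2 col=$3 row=$4 sx=$5 sy=$6 gap=${7:-0}
    local w=$(( sx / cols )) h=$(( sy / rows ))
    echo "$(( col * w + gap )) $(( row * h + gap )) $(( w - 2 * gap )) $(( h - 2 * gap ))"
}

# Usage: put the focused window in the right-hand third of the screen.
#   set -- $(xrandr -q | screen_size)        # $1 = width, $2 = height
#   set -- $(cell 3 1 2 0 "$1" "$2" 10)      # $1..$4 = LEFT TOP W H
#   wmctrl -r :ACTIVE: -b remove,maximized_horz,maximized_vert
#   xdotool getwindowfocus windowsize "$3" "$4" windowmove "$1" "$2"
```

With a 3840×2160 screen, `cell 2 2 1 0 3840 2160` gives the same numbers as the “ne” case in the script above.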

Summary

There’s a line in one of the Star Trek movies (the real ones, with William Shatner) where Kirk tells someone that you have to learn how things work on a starship. Linux is much the same. There’s not much you can’t do if you can only figure out how and wade through the myriad tools that might be what you want. Sometimes it takes a combination of tools and dealing with the infinite variety of configurations is tough, but you can usually make things happen if you try.

44 thoughts on “Linux Fu: X Command”

  1. I’m sure Wayland will have all these features right? I mean the developers have made it so clear that they intend to keep all that cool functionality that brings some of us to choose Linux/Unix in the first place and not just turn it into yet another boring Windows clone. They are only eliminating stuff that nobody at all is using because… well.. I guess games will run better after that. Right?!?!

    1. There is no Wayland. Well there is but it is not a program to run. It is a protocol and a library which implements it. Most development/design efforts went into securing Wayland. The result is the key difference between X and Wayland, in case of the latter, display server and window manager are integrated in a single process.

      1. When everybody and their brother is writing their own competing Wayland compositors I will concede that point. I probably won’t care either because I’ll probably be running Windows or maybe even OSX as everything better about Linux/Unix will finally be gone.. Until then Wayland = Weston.

    2. If you actually understood open source you’d realise why there is a world where both X and Wayland are important. You see features, I see gaping security nightmares. The only thing worse than Wayland, is having ONLY X. The real benefit is to move away from the monoculture that has developed around a 25 year old patch work of ugly code.

      1. And so we got firejail (or similar jail method) and Xpra. Bonus: firejail protects at the filesystem and memory levels too. No compromised app can drop a nasty thing in your shell rc files. Or read your ssh keys. While Xpra can take care of clipboard attacks.

        And proof that there is nothing in the X11 protocol that forces the system to give total access. We could get a secure X11 like we already are getting in terms of dropping root caps if there was the will to find and solve.

        But Linux has been in NIH mode for the past 10 years, kicking out features instead of finishing and fixing what already exists. CADT as jwz (of Netscape fame) said.

      3. “The only thing worse than Wayland, is having ONLY X.”

        Great! Someone advocating for more fragmentation. As someone who actually uses Linux as a desktop I may have given up the hope that everyone else will too some day but I have not given up hope that at least EVERY application I want or need to run will one day run on Linux. I just can’t wait for the day when the application I want to run only runs on Wayland but I still prefer X because Wayland is a jail and X gives me features and choices.

    1. Yep, that’s why the top 500 supercomputers in the world run Linux and NONE of them run anything Microsoft and have not done for many years now. I did think about posting all the things that you can do with Linux that you cannot do with Windows but I fear it would hit the post limit for a comment. Windows is closed source where as Linux is open source and is extensible so guess what, people are extending what it can do without having to wait for Microsoft to release an upgrade or for a 3rd party to sell you a piece of software to make Windows do something new.

      1. And those who would find the comments above couldn’t care less of what other things linux can do which windows can’t.

        The point is that those who read the article and thought “neat! Now how could I do that in my windows env?” will find it quite useful. In contrast to another linux vs windows debate unrelated to the article. No one cares… stop it.

  2. X11 was great for its time but now it should be considered an insecure windowing system that uses hacked in hardware acceleration. Wayland is better but not great, so there currently isn’t a really good windowing system on Linux. :(

      1. Everything is the answer to that. X may sound great from a feature perspective, but the incredible feature set is built from 25 years of hacks. X under the hood is ugly. It is insecure. Its hacked together design allows all sorts of security issues e.g. one application to take over the window of another. That may not sound serious until you try to e.g. lock the screen, or prevent a phishing window from pretending to be your login screen.

        Wayland is an alternative that was designed from the ground up as a modern system. Is it better feature wise? No. Is it better as a fundamentally simple window manager suited to a fast secure desktop environment? Hell yeah.

        1. Secure, sure. Less code running with elevated privileges? Absolutely. Desktop environment? I guess, but I don’t like GNOME.

          But fast? Don’t make me laugh. Ever since people started porting things to GTK3 all I started noticing was an extra second of black screen after the window was mapped and before the program started putting data in its own buffer, on top of a general argument that I am Using My Computer Wrong.

          Performance in terms of what I see is definitely worse than using Xfree86 with its now-removed XAA.

      2. Remote display.

        Yes, supposedly it does that. But it doesn’t. The claim that it does is only because some coder managed to insert it as a preliminary hack that has never been properly documented and nobody who isn’t fully versed in low-level display server programming has any hope of reproducing.

        Then there’s the Wayland supporters that say remote support should be re-implemented by every application. The genius behind X was that every application can be remote-displayed regardless of if the author intended it or not. Nobody gets to tell the user what they can and cannot do. Also the procedure to remote display one application under X is exactly the same as the procedure for another one.

        Using Wayland on a desktop is like putting a cell-phone carrier in charge of administering it. Somebody else is telling you what you can and cannot do with your own device. It is a movement in the wrong direction.

  3. I love xdotool! Every semester that I’m forced to license an ebook, I use xdotool in a bash script to turn the page while my packet sniffer captures the page images as they load. About an hour for 500 pages and some carefully tweaked compression gets me a somewhat reasonably sized PDF. I was thinking of making a graphical front end so I don’t have to tweak the script every time.

    1. Well… today in GMT timezone… we’ve been seeing hacks all day.
      If you want to take “Hackaday” literally, then the staff around here could register:
      HackadaySpecificallyForTheNeedsOfThatSBRKtroll.com
      With exactly one hack within exactly 24 hours of each blog post.

      These tutorials are here to provoke thoughts and ideas in order to encourage mods, bodges, hacks, makeshifts, जुगाड़, etc…

      It seems to work occasionally by the looks of things.

      More of these tagged as tutorial is very welcome… Especially if someone is trying to configure an embedded system and requires tips from the likes of these to complete their project… Could be for another take on those “Energy monitoring boards” from some major supermarkets.

  4. I use FVWM… with that I am able to set up a menu that can tile windows in the configurations I need:

    DestroyMenu WindowMoveSideMenu
    AddToMenu WindowMoveSideMenu
    + "&Left"	Resize
    + "&Right"	Resize
    + "&Top"	Resize
    + "&Bottom"	Resize
    DestroyMenu WindowMoveCnrMenu
    AddToMenu WindowMoveCnrMenu
    + "Top &Left"		Resize
    + "&Top Right"		Resize
    + "Bottom &Right"	Resize
    + "&Bottom Left"	Resize
    
    DestroyMenu WindowToDeskMenu
    AddToMenu WindowToDeskMenu
    + "To Desk &1"	MoveToDesk 0 0
    + "To Desk &2"	MoveToDesk 0 1
    + "To Desk &3"	MoveToDesk 0 2
    + "To Desk &4"	MoveToDesk 0 3
    
    DestroyMenu WindowToPageMenu
    AddToMenu WindowToPageMenu
    + "To Page &1" MoveToPage 0 0
    + "To Page &2" MoveToPage 1 0
    + "To Page &3" MoveToPage 0 1
    + "To Page &4" MoveToPage 1 1
    
    DestroyMenu GoToDeskMenu
    AddToMenu GoToDeskMenu
    + "Desk &1"	GotoDesk 0 0
    + "Desk &2"	GotoDesk 0 1
    + "Desk &3"	GotoDesk 0 2
    + "Desk &4"	GotoDesk 0 3
    
    DestroyMenu GoToPageMenu
    AddToMenu GoToPageMenu
    + "Page &1" GotoPage 0 0
    + "Page &2" GotoPage 1 0
    + "Page &3" GotoPage 0 1
    + "Page &4" GotoPage 1 1
    
    DestroyFunc SplitTopHalf
    AddToFunc   SplitTopHalf
    + I Maximize True 100 50
    + I Move 0 0
    
    DestroyFunc SplitBottomHalf
    AddToFunc   SplitBottomHalf
    + I Maximize True 100 50
    + I Move 0 50
    
    DestroyFunc SplitLeftHalf
    AddToFunc   SplitLeftHalf
    + I Maximize True 50 100
    + I Move 0 0
    
    DestroyFunc SplitRightHalf
    AddToFunc   SplitRightHalf
    + I Maximize True 50 100
    + I Move 50 0
    
    DestroyMenu WindowSplitHalfMenu
    AddToMenu WindowSplitHalfMenu
    + "&Top"	SplitTopHalf
    + "&Bottom"	SplitBottomHalf
    + "&Left"	SplitLeftHalf
    + "&Right"	SplitRightHalf
    
    DestroyFunc SplitTopLeft
    AddToFunc   SplitTopLeft
    + I Maximize True 50 50
    + I Move 0 0
    
    DestroyFunc SplitTopRight
    AddToFunc   SplitTopRight
    + I Maximize True 50 50
    + I Move 50 0
    
    DestroyFunc SplitBottomLeft
    AddToFunc   SplitBottomLeft
    + I Maximize True 50 50
    + I Move 0 50
    
    DestroyFunc SplitBottomRight
    AddToFunc   SplitBottomRight
    + I Maximize True 50 50
    + I Move 50 50
    
    
    DestroyMenu WindowSplitQrtMenu
    AddToMenu WindowSplitQrtMenu
    + "Top &Left"		SplitTopLeft
    + "&Top Right"		SplitTopRight
    + "Bottom &Right"	SplitBottomRight
    + "&Bottom Left"	SplitBottomLeft
    
    DestroyMenu WindowSplitMenu
    AddToMenu WindowSplitMenu
    + "&2 Half"	Popup WindowSplitHalfMenu
    + "&4 Quarter"	Popup WindowSplitQrtMenu
    
    DestroyMenu WindowMoveMenu
    AddToMenu WindowMoveMenu
    + "&Window"	Move
    + "&Side"	Popup WindowMoveSideMenu
    + "&Corner"	Popup WindowMoveCnrMenu
    + "To &Desk"	Popup WindowToDeskMenu
    + "To Pag&e"	Popup WindowToPageMenu
    
    DestroyMenu QuickLaunchMenu
    AddToMenu QuickLaunchMenu
    + "&Console"		Exec exec lilyterm
    + "&Editor" 		Exec exec qvim
    + "&Web Browser"	Exec exec firefox
    + "&Mail Client"	Exec exec thunderbird
    + "&File Manager"	Exec exec konqueror
    + "&Python Shell"	Exec exec lilyterm -e ipython
    + "&Spreadsheet"	Exec exec gnumeric
    + "&Remote Desktop"	Exec exec krdc
    + "Document &Viewer"	Exec exec okular
    
    DestroyMenu WindowOpsMenu
    AddToMenu WindowOpsMenu
    + "&Close"	Close
    + "&Move/Size"	Popup WindowMoveMenu
    + "Ma&ximise"	Maximize 100 100
    + "Sp&lit"	Popup WindowSplitMenu
    + "&Iconify"	Iconify
    + "&Switch"	WindowList Root c c
    + "More &Ops"	Popup Window
    + ""		Nop
    + "To &Desk"	Popup GoToDeskMenu
    + "To Pag&e"	Popup GoToPageMenu
    + "&Quick Launch" Popup QuickLaunchMenu
    + "Root Menu (&A)" Popup Utilities
    + "BarButtons (&Z)" GoToBarButtons
    
    DestroyMenu RootOpsMenu
    AddToMenu RootOpsMenu
    + "&Switch"	WindowList Root c c
    + ""		Nop
    + "To &Desk"	Popup GoToDeskMenu
    + "To Pag&e"	Popup GoToPageMenu
    + "&Quick Launch" Popup QuickLaunchMenu
    + "Root Menu (&A)" Popup Utilities
    + "BarButtons (&Z)" GoToBarButtons
    
    Key Super_L	FSTW1		N	Menu WindowOpsMenu
    Key Super_L	R		N	Menu RootOpsMenu
    

    With that, I hit the logo key (Meta, Command, “Windows”… whatever you call it). If the cursor is in a window, I have the ability to re-size it to occupy the full screen, ½ screen or ¼ screen, and the ability to switch to a different window, switch desktops or pages, move windows between desktops/pages, close windows or launch new windows.

    Tweaking it a little, it is possible to make that same environment work on a touchscreen. FVWM might be ancient, but so far I’ve found it a very flexible window manager, and it’s an environment I keep coming back to having tried Gnome, KDE, XFCE, CDE, WindowMaker, CTWM, AfterStep and numerous other environments.

    1. Go figure, There are at least three of us, then :-)

      Seriously. After TWM it was FVWM for me. (the old one). Then Gnome. And since the tale of boiling frogs is obviously false, I, at some point run screaming from it when the water became too hot. After some experimenting (Awesome, Xfce) it was back to… FVWM.

      Never looked back.

  6. Don’t ruin HackADay.
    Attempts to ruin the ‘net continue.
    Let SOMEthing, be and shout from
    the rooftop (antennii,) FREE!
    Let something stick in the craw of capitalism
    and bespeak of the modern Colt 45,
    PeaceMaker. Failing that, Equalizer.
    I am old from SillyComV, but new here.
    You have reinspired the last 50-60 years
    of my subtle hacking.
    Carry-On, as Frank Lloyd Wright, when
    he gifted us with the cinderblock.

      1. At least it wasn’t Microsoft’s Tay AI Bot…
        Otherwise for being human and not metal, we’d already been off to the Skynet “Human Holiday camps” to have been given Skynet certified “Showers” before we had realized too late what happened!

  7. I really tried to report this typo privately, but couldn’t figure out how.

    “In a perfect world your target program will use D-Bus but that is now always the case. ”

    …should be

    “In a perfect world your target program will use D-Bus but that is not always the case. ”

    perfect worlds have perfect grammar ;-).

    … But a great article regardless. Thanks.

    1. Unless you’re presenting something in tutorial form, then it is essentially public domain….

      That is unless your local laws enforce a default of copyright on everything including sub-contents of a tutorial, thus essentially rendering said tutorial completely useless as learning from said tutorial before putting it into practice would infringe on copyright in said barbaric highly restricting country.
