Wednesday, January 9, 2019

GUI automation with PyAutoGUI module

With GUI automation we can write programs that directly control the keyboard and mouse. These programs can control other applications by sending them virtual keystrokes and mouse clicks, just as if we were sitting at the computer and interacting with the applications ourselves. The pyautogui module has functions for simulating mouse movements, button clicks, and mouse-wheel scrolling, as well as keyboard input.

Let's install the pyautogui module first. Depending on which operating system you’re using, you may have to install some other modules (called dependencies) before you can install PyAutoGUI. (On Windows, there are no other modules to install.) After these dependencies are installed, run pip install pyautogui to install PyAutoGUI. See the output of the pip command below:

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Python>pip install pyautogui
Collecting pyautogui
  Downloading https://files.pythonhosted.org/packages/19/ef/438d80abd396fd2d124b
d37c07c765f913723c54197c4c809d85c8ff5a43/PyAutoGUI-0.9.41.tar.gz (50kB)
    100% |████████████████████████████████| 51kB 787kB/s
Collecting pymsgbox (from pyautogui)
  Downloading https://files.pythonhosted.org/packages/b6/65/86379ede1db26c40e797
2d7a41c69cdf12cc6a0f143749aabf67ab8a41a1/PyMsgBox-1.0.6.zip
Collecting PyTweening>=1.0.1 (from pyautogui)
:
:
--Snip--
Successfully built pyautogui pymsgbox PyTweening pyscreeze pygetwindow pyrect
Installing collected packages: pymsgbox, PyTweening, pyscreeze, pyrect, pygetwin
dow, pyautogui
Successfully installed PyTweening-1.0.3 pyautogui-0.9.41 pygetwindow-0.0.3 pymsg
box-1.0.6 pyrect-0.1.4 pyscreeze-0.1.19

C:\Users\Python>

Functions in PyAutoGUI

PyAutoGUI tracks and controls the mouse using the screen's coordinate system, so its mouse functions take x- and y-coordinates. The origin, where x and y are both zero, is at the upper-left corner of the screen. The x-coordinates increase going to the right, and the y-coordinates increase going down. All coordinates are non-negative integers; there are no negative coordinates. Your resolution is how many pixels wide and tall your screen is. On a screen with a resolution of 1920×1080, for example, the coordinate for the upper-left corner is (0, 0), and the coordinate for the bottom-right corner is (1919, 1079).
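
The coordinate rules above can be sketched with two small helpers. This is only an illustration: bottom_right() and is_on_screen() are hypothetical names, not PyAutoGUI functions, and the width and height are passed in by hand rather than read from the screen.

```python
def bottom_right(width, height):
    """Return the coordinate of the bottom-right pixel for a given resolution."""
    # Coordinates start at 0, so the last pixel is one less than the size.
    return (width - 1, height - 1)


def is_on_screen(x, y, width, height):
    """Return True if (x, y) is a valid coordinate for the given resolution."""
    return 0 <= x < width and 0 <= y < height


print(bottom_right(1920, 1080))              # (1919, 1079)
print(is_on_screen(1919, 1079, 1920, 1080))  # True
print(is_on_screen(1920, 1080, 1920, 1080))  # False
```

PyAutoGUI has a comparable check of its own, pyautogui.onScreen(x, y), which uses the real screen size.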

1. size(): This function returns the screen resolution. Let's write a program to determine the size of our screen:

import pyautogui

print(pyautogui.size())

The pyautogui.size() function returns a two-integer tuple of the screen’s width and height in pixels as shown in the output below:

Size(width=1366, height=768)
------------------
(program exited with code: 0)

Press any key to continue . . .

pyautogui.size() returns (1366, 768) on my computer, but on a computer with a 1920×1080 resolution it will return (1920, 1080); depending on your screen’s resolution, your return value may be different.

2. moveTo(): Use this function to move the mouse pointer to a given position, as shown in the following program:

import pyautogui 
pyautogui.moveTo(100, 100, duration = 1) 

This function moves the mouse pointer from its current location to the given x, y coordinates, taking the time specified by the duration argument (in seconds) to do so. When you run this program, you will see the mouse pointer glide from its current location to the coordinates (100, 100) in 1 second.

3. moveRel(): This function moves the mouse pointer relative to its current position, as shown in the following program:

import pyautogui 
pyautogui.moveRel(0, 50, duration = 1) 

This program moves the mouse pointer by an offset of (0, 50) from its current position. For example, if the mouse position before running the code was (1000, 1000), this code moves the pointer to the coordinates (1000, 1050) over 1 second.
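
The arithmetic of a relative move can be written out as a tiny helper. relative_target() is a hypothetical name used only for this illustration; it does not move the mouse, it just computes where a relative move would end.

```python
def relative_target(current, x_offset, y_offset):
    """Return the absolute (x, y) position a relative move would end at."""
    x, y = current
    return (x + x_offset, y + y_offset)


print(relative_target((1000, 1000), 0, 50))  # (1000, 1050)
```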

4. position(): This function returns the current position of the mouse pointer, as shown in the following program:

import pyautogui 
print(pyautogui.position()) 

The output is the coordinates of the mouse pointer at the time the program ran. In my case the coordinates are:

Point(x=629, y=69)

------------------
(program exited with code: 0)

Press any key to continue . . .

5. click(): The click() function clicks at a given coordinate on the screen. By default, the click uses the left mouse button and takes place wherever the mouse cursor is currently located. PyAutoGUI also provides the doubleClick(), rightClick(), and middleClick() functions, all of which accept optional x- and y-coordinates, as shown in the following program:

import pyautogui 
pyautogui.click(100, 100)
pyautogui.doubleClick(80,80)
pyautogui.rightClick(80,80)
pyautogui.middleClick()

This code performs an ordinary left click at (100, 100), a double click at (80, 80), a right click at (80, 80), and a middle click at the mouse's current position. You can pass the x- and y-coordinates of the click as optional first and second arguments if you want it to take place somewhere other than the mouse’s current position.

If you want to specify which mouse button to use, include the button keyword argument, with a value of 'left', 'middle', or 'right'. For example, pyautogui.click(80, 150, button='left') will click the left mouse button at the coordinates (80, 150), while pyautogui.click(100, 230, button='right') will perform a right-click at (100, 230).

A full “click” is defined as pushing a mouse button down and then releasing it back up without moving the cursor. We can also perform a click by calling pyautogui.mouseDown(), which only pushes the mouse button down, and pyautogui.mouseUp(), which only releases the button. These functions have the same arguments as click(), and in fact, the click() function is just a convenient wrapper around these two function calls.

The pyautogui.doubleClick() function will perform two clicks with the left mouse button, while the pyautogui.rightClick() and pyautogui.middleClick() functions will perform a click with the right and middle mouse buttons, respectively.

6. Dragging the Mouse: PyAutoGUI provides the pyautogui.dragTo() and pyautogui.dragRel() functions to drag the mouse cursor to a new location or a location relative to its current one. The arguments for dragTo() and dragRel() are the same as moveTo() and moveRel(): the x-coordinate/horizontal movement, the y-coordinate/vertical movement, and an optional duration of time. This is useful for tasks such as moving a dialog box or drawing automatically with the pencil tool in MS Paint. The following program draws a square-like shape in Paint:

import pyautogui 
import time
time.sleep(10)
pyautogui.click()  
# click to put drawing program in focus
distance = 200

pyautogui.dragRel(distance, 0, duration=0.2) # move right
distance = distance - 5
pyautogui.dragRel(0, distance, duration=0.2) # move down
pyautogui.dragRel(-distance, 0, duration=0.2) # move left
distance = distance - 5
pyautogui.dragRel(0, -distance, duration=0.2) # move up


Before running the code, open MS Paint with the pencil tool selected. Then run the code and switch to the Paint window within 10 seconds (the time.sleep(10) call pauses the program for 10 seconds before it does anything), leaving the mouse pointer over the canvas. After the pause, you will see a roughly square shape being drawn in MS Paint.

The distance variable starts at 200, so the first dragRel() call drags the cursor 200 pixels to the right, taking 0.2 seconds. The distance variable is then decreased to 195. The second dragRel() call drags the cursor 195 pixels down. The third dragRel() call drags the cursor −195 pixels horizontally (195 to the left). The distance variable is then decreased to 190, and the last dragRel() call drags the cursor 190 pixels up, which completes the shape. Because each side is slightly shorter than the one before it, the shape does not close exactly; it is the first turn of a square spiral.
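
The same shrinking-side idea can be carried on until the distance reaches zero, which draws a full square spiral. The sketch below is one possible way to do that, not part of PyAutoGUI: spiral_offsets() and draw_spiral() are hypothetical helper names. The offsets are computed separately so they can be inspected without touching the mouse.

```python
def spiral_offsets(start=200, step=5):
    """Return the list of (dx, dy) drag offsets for a square spiral."""
    directions = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # right, down, left, up
    offsets = []
    distance = start
    leg = 0
    while distance > 0:
        dx, dy = directions[leg % 4]
        offsets.append((dx * distance, dy * distance))
        # Shrink after the right and left legs, as in the program above.
        if leg % 2 == 0:
            distance -= step
        leg += 1
    return offsets


def draw_spiral(start=200, step=5, duration=0.2):
    """Drag the mouse along the spiral (needs a drawing program in focus)."""
    import pyautogui  # imported here so spiral_offsets() works without a display
    for dx, dy in spiral_offsets(start, step):
        pyautogui.dragRel(dx, dy, duration=duration)


print(spiral_offsets(200, 5)[:4])
# [(200, 0), (0, 195), (-195, 0), (0, -190)]
```

To try it, call time.sleep(10), click on the Paint canvas, and then call draw_spiral(), just as in the program above.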

7. scroll(): The scroll() function takes an integer argument for how many units to scroll the mouse wheel. The size of a unit varies for each operating system and application, so you’ll have to experiment to see exactly how far it scrolls in your particular situation. The scrolling takes place at the mouse cursor’s current position. Passing a positive integer scrolls up, and passing a negative integer scrolls down. See the following program:

import pyautogui,time 

time.sleep(5)
pyautogui.scroll(200)

After starting the program, we have five seconds to click the window we want to scroll (for example, a file editor window) to put it in focus. Once the pause is over, the pyautogui.scroll(200) call scrolls that window up. The next program scrolls down first and then, after another five-second pause, scrolls back up. Run it and watch the window:

import pyautogui,time 

time.sleep(5)
pyautogui.scroll(-500)

time.sleep(5)
pyautogui.scroll(500)

8. screenshot(): PyAutoGUI has screenshot features that can create an image file based on the current contents of the screen. The screenshot() function also returns a Pillow Image object of the current screen’s appearance. See the following program, which shows how to take a screenshot in Python:

import pyautogui 

my_sc = pyautogui.screenshot()
print(my_sc.getpixel((0, 0)))
print(my_sc.getpixel((100, 150)))

The my_sc variable will contain the Image object of the screenshot. We then call getpixel() on the Image object, passing a tuple of coordinates such as (0, 0) or (100, 150), and it tells us the color of the pixel at those coordinates in our image. The return value from getpixel() is an RGB tuple of three integers for the amount of red, green, and blue in the pixel. The output of the above program is shown below:

(106, 148, 205)
(182, 102, 0)
------------------
(program exited with code: 0)

Press any key to continue . . .

In case we need to verify that a single pixel on the screen matches a given color, call the pixelMatchesColor() function, passing it the x-coordinate, y-coordinate, and RGB tuple of the expected color. See the following program:

import pyautogui 

my_sc = pyautogui.screenshot()
print(my_sc.getpixel((0, 0)))
print(my_sc.getpixel((100, 150)))
print(pyautogui.pixelMatchesColor(0, 0, (106, 148, 205)))
print(pyautogui.pixelMatchesColor(100, 150, (182, 102, 0)))
print(pyautogui.pixelMatchesColor(0, 0, (100, 150, 200)))
print(pyautogui.pixelMatchesColor(100, 150, (150, 100, 50)))

After taking a screenshot and using getpixel() to get an RGB tuple for the color of a pixel at specific coordinates, we pass the same coordinates and RGB tuple to pixelMatchesColor(). This should return True. Then we change the values in the RGB tuple and call pixelMatchesColor() again for the same coordinates. This should return False. The output of the above program is shown below:

(106, 148, 205)
(182, 102, 0)
True
True
False
False

------------------
(program exited with code: 0)

Press any key to continue . . .

This function can be useful to call just before our GUI automation program calls click(), to check that the screen looks as expected. Note that the color at the given coordinates must match exactly. If it is even slightly different, for example (106, 148, 206) instead of (106, 148, 205), then pixelMatchesColor() will return False.

The optional tolerance keyword argument specifies how much each of the red, green, and blue values can vary while still matching. See the following program:

import pyautogui 

my_sc = pyautogui.screenshot()
print(my_sc.getpixel((0, 0)))
print(my_sc.getpixel((100, 150)))
print(pyautogui.pixelMatchesColor(0, 0, (106, 148, 205)))
print(pyautogui.pixelMatchesColor(100, 150, (182, 102, 0)))
print(pyautogui.pixelMatchesColor(0, 0, (100, 150, 200), tolerance=10))
print(pyautogui.pixelMatchesColor(100, 150, (150, 100, 50), tolerance=10))

The output of the above program is shown below:

(106, 148, 205)
(21, 12, 0)
True
False
True
False

------------------
(program exited with code: 0)

Press any key to continue . . .

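As you may have noticed, the fifth line of output (the third pixelMatchesColor() call) is now True: each of the values in (100, 150, 200) is within the tolerance of 10 of the actual color (106, 148, 205). In this run the pixel at (100, 150) had changed to (21, 12, 0), so both checks at that position return False.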
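
The tolerance rule described above can be sketched in plain Python. This is an illustration of the comparison as described in this post, not PyAutoGUI's own code, and matches_with_tolerance() is a hypothetical name.

```python
def matches_with_tolerance(actual, expected, tolerance=0):
    """Return True if every RGB channel differs by at most `tolerance`."""
    return all(abs(a - e) <= tolerance for a, e in zip(actual, expected))


# Same values as in the output above.
print(matches_with_tolerance((106, 148, 205), (100, 150, 200), tolerance=10))  # True
print(matches_with_tolerance((21, 12, 0), (150, 100, 50), tolerance=10))       # False
print(matches_with_tolerance((106, 148, 205), (106, 148, 206)))                # False
```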


9. The Locate Functions: We can visually locate something on the screen if we have an image file of it. What if you do not know beforehand where PyAutoGUI should click? In such cases we can use image recognition: we give PyAutoGUI an image of what we want to click and let it figure out the coordinates. Suppose we want to click the cancel button in a pop-up window. We can’t call the moveTo() and click() functions if we don’t know the exact screen coordinates of the cancel button. The pop-up can appear in a slightly different place each time it is launched, forcing us to find the coordinates again each time. However, if we have an image of the cancel button, we can call the locateOnScreen('cancel.png') function to get the screen coordinates.

To see how locateOnScreen() works, I took a screenshot of a small area of my screen containing the cancel button and saved it as cancel.png. See the following program:

import pyautogui 

print(pyautogui.locateOnScreen('cancel.png'))

The output of the above program is shown below:

(643, 745, 70, 29)
------------------
(program exited with code: 0)

Press any key to continue . . .


The return value is a four-integer tuple: (left, top, width, height). If the image can’t be found on the screen, locateOnScreen() returns None (depending on the installed version, it may instead raise pyautogui.ImageNotFoundException). If the image may appear in several places on the screen, use locateAllOnScreen(), which returns a generator object that can be passed to list() to get a list of four-integer tuples. There will be one four-integer tuple for each location where the image is found on the screen. See the following program:

import pyautogui 

print(list(pyautogui.locateAllOnScreen('cancel.png')))

The output of the above program is shown below:

[(643, 745, 70, 29), (1007, 801, 70, 29)]
------------------
(program exited with code: 0)

Press any key to continue . . .

Each of the four-integer tuples represents an area on the screen. If our image is only found in one area, then using list() and locateAllOnScreen() just returns a list containing one tuple. Once we have the four-integer tuple for the area on the screen where our image was found, we can click the center of this area by passing the tuple to the center() function to return x- and y-coordinates of the area’s center. See the following program:

import pyautogui 

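# Find the cancel button, using the same image file as in the earlier examples
button_location = pyautogui.locateOnScreen('cancel.png')
# Convert the (left, top, width, height) area into the x, y of its center
button_x, button_y = pyautogui.center(button_location)
pyautogui.click(button_x, button_y)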

Once you have center coordinates from center(), passing the coordinates to click() should click the center of the area on the screen that matches the image you passed to locateOnScreen().
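
The arithmetic behind finding the center of an area can be sketched as follows. box_center() is a hypothetical helper written for this post, and rounding down with integer division is an assumption of the sketch; in a real program you would simply call pyautogui.center().

```python
def box_center(box):
    """Return the (x, y) center of a (left, top, width, height) area."""
    left, top, width, height = box
    # Integer division keeps the result on a whole pixel.
    return (left + width // 2, top + height // 2)


print(box_center((643, 745, 70, 29)))  # (678, 759)
```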

There are several “locate” functions. They all start looking at the top-left corner of the screen (or image) and look to the right and then down. Some of them are:
  1. locateOnScreen(image, grayscale=False) - Returns (left, top, width, height) coordinate of first found instance of the image on the screen. Returns None if not found on the screen.
  2. locateCenterOnScreen(image, grayscale=False) - Returns (x, y) coordinates of the center of the first found instance of the image on the screen. Returns None if not found on the screen.
  3. locateAllOnScreen(image, grayscale=False) - Returns a generator that yields (left, top, width, height) tuples for where the image is found on the screen.
  4. locate(needleImage, haystackImage, grayscale=False) - Returns (left, top, width, height) coordinate of first found instance of needleImage in haystackImage. Returns None if not found in haystackImage.
  5. locateAll(needleImage, haystackImage, grayscale=False) - Returns a generator that yields (left, top, width, height) tuples for where needleImage is found in haystackImage.
These “locate” functions are fairly expensive; they can take a full second to run. The best way to speed them up is to pass a region argument (a 4-integer tuple of (left, top, width, height)) to only search a smaller region of the screen instead of the full screen. See the code below:

import pyautogui
button_location = pyautogui.locateOnScreen('cancel.png', region=(0, 0, 300, 400))

Optionally, we can pass grayscale=True to the locate functions to give a slight speedup (about 30%). This desaturates the color from the images and screenshots, speeding up the locating but potentially causing false-positive matches. See the code below:

import pyautogui
button_location = pyautogui.locateOnScreen('cancel.png', grayscale=True)
print(button_location)

The output of the above program is shown below:

(1416, 562, 50, 41)
------------------
(program exited with code: 0)

Press any key to continue . . .
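
Because a pop-up may take a moment to appear, a program often needs to retry a locate call for a while before giving up. The helper below is a minimal sketch of that idea, not a PyAutoGUI feature: wait_for_image() and its parameters are hypothetical names. It takes the locate call as a function so that it works with any of the locate functions listed above.

```python
import time


def wait_for_image(locate, timeout=5.0, interval=0.5, not_found=()):
    """Call locate() repeatedly until it returns a result or time runs out.

    locate    -- a function with no arguments, e.g. a lambda wrapping
                 pyautogui.locateOnScreen('cancel.png')
    not_found -- exception classes that should be treated as "not found yet"
    Returns the located area, or None if nothing was found within timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            box = locate()
        except not_found:
            box = None
        if box is not None:
            return box
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Example usage (needs pyautogui and a screen, so it is left as a comment):
# import pyautogui
# box = wait_for_image(lambda: pyautogui.locateOnScreen('cancel.png'),
#                      not_found=(pyautogui.ImageNotFoundException,))
```

Passing the exception class in not_found covers PyAutoGUI versions that raise an exception instead of returning None.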

10. typewrite(): We can automate typing a string with the typewrite() function. Just pass the string you want to type as the argument. See the code below:

import pyautogui 

pyautogui.click(100, 100) 
pyautogui.typewrite("learn Python now!")

First, open a new file editor window (I am using notepad) and position it in the upper-left corner of your screen so that PyAutoGUI will click in the right place to bring it into focus. Now run the program. Python will first send a virtual mouse click to the coordinates (100, 100), which should click the file editor window and put it in focus. The typewrite() call will send the text learn Python now! to the window.

By default, the typewrite() function will type the full string instantly. However, you can pass an optional second argument to add a short pause between each character. This second argument is an integer or float value of the number of seconds to pause. For example, pyautogui.typewrite('learn Python now!', 0.25) will wait a quarter-second after typing l, another quarter-second after e, and so on. This gradual typewriter effect may be useful for slower applications that can’t process keystrokes fast enough to keep up with PyAutoGUI. For characters such as P or !, PyAutoGUI will automatically simulate holding down the shift key as well. See the code below:

import pyautogui 

pyautogui.click(100, 100) 
pyautogui.typewrite("learn Python now!",0.25)
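
As a rough guide to how long such a call takes, the pause is applied once per character. The helper below is a hypothetical illustration that ignores the time needed for the keystrokes themselves.

```python
def estimated_typing_time(text, interval):
    """Estimate the seconds spent pausing when typing text with an interval."""
    return len(text) * interval


print(estimated_typing_time("learn Python now!", 0.25))  # 4.25
```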

Not all keys are easy to represent with single text characters, for example shift or the arrow keys. In PyAutoGUI, these keyboard keys are represented by short string values instead: 'esc' for the esc key or 'enter' for the enter key. Instead of a single string argument, a list of these keyboard key strings can be passed to typewrite(). See the code below:

import pyautogui 
pyautogui.typewrite(["a", "left", "ctrlleft"]) 

This code is equivalent to typing “a”, pressing the left arrow key, and pressing the left control key. Examine the pyautogui.KEYBOARD_KEYS list to see all possible keyboard key strings that PyAutoGUI will accept. The 'shift' string refers to the left shift key and is equivalent to 'shiftleft'. The same applies to the 'ctrl', 'alt', and 'win' strings; they all refer to the left-side key.

11. Functions for Pressing and Releasing the Keyboard: The pyautogui.keyDown() and pyautogui.keyUp() functions send virtual key presses and releases to the computer. Each takes a keyboard key string as its argument. PyAutoGUI also provides the pyautogui.press() function, which calls both of these functions to simulate a complete keypress. For example:

pyautogui.press('enter')  # press the Enter key
pyautogui.press('f1')     # press the F1 key
pyautogui.press('left')   # press the left arrow key

The press() function is really just a wrapper for the keyDown() and keyUp() functions, which simulate pressing a key down and then releasing it up. These functions can be called by themselves. For example, to press the left arrow key three times while holding down the Shift key, call the following:

pyautogui.keyDown('shift')  # hold down the shift key
pyautogui.press('left')     # press the left arrow key
pyautogui.press('left')     # press the left arrow key
pyautogui.press('left')     # press the left arrow key
pyautogui.keyUp('shift')    # release the shift key

To press multiple keys similar to what typewrite() does, pass a list of strings to press(). For example:

pyautogui.press(['left', 'left', 'left'])

12. The hotkey() Function: A hotkey or shortcut is a combination of keypresses that invokes some application function. For example, the hotkey for copying a selection is Ctrl-C. To make pressing hotkeys or keyboard shortcuts convenient, hotkey() can be passed several key strings, which will be pressed down in order and then released in reverse order. This code: pyautogui.hotkey('ctrl', 'shift', 'esc') is equivalent to this code:

pyautogui.keyDown('ctrl')
pyautogui.keyDown('shift')
pyautogui.keyDown('esc')
pyautogui.keyUp('esc')
pyautogui.keyUp('shift')
pyautogui.keyUp('ctrl')
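
The press-in-order, release-in-reverse rule can be sketched without touching the keyboard. hotkey_sequence() is a hypothetical helper for illustration only; it just lists the calls that the rule implies.

```python
def hotkey_sequence(*keys):
    """Return the (function name, key) steps implied by a hotkey."""
    presses = [("keyDown", key) for key in keys]
    releases = [("keyUp", key) for key in reversed(keys)]
    return presses + releases


for step in hotkey_sequence("ctrl", "shift", "esc"):
    print(step)
# ('keyDown', 'ctrl')
# ('keyDown', 'shift')
# ('keyDown', 'esc')
# ('keyUp', 'esc')
# ('keyUp', 'shift')
# ('keyUp', 'ctrl')
```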

With this, today's post comes to an end. Until we meet next, keep practicing and learning Python, as Python is easy to learn!