"Previous Section"_Section_modify.html - "LAMMPS WWW Site"_lws - "LAMMPS Documentation"_ld - "LAMMPS Commands"_lc - "Next Section"_Section_errors.html :c
:link(lws,http://lammps.sandia.gov)
:link(ld,Manual.html)
:link(lc,Section_commands.html#comm)
:line
11. Python interface to LAMMPS :h3
This section describes how to build and use LAMMPS via a Python
interface.
11.1 "Setting necessary environment variables"_#py_1
11.2 "Building LAMMPS as a shared library"_#py_2
11.3 "Extending Python with MPI to run in parallel"_#py_3
11.4 "Testing the Python-LAMMPS interface"_#py_4
11.5 "Using LAMMPS from Python"_#py_5
11.6 "Example Python scripts that use LAMMPS"_#py_6 :ul
The LAMMPS distribution includes the file python/lammps.py which wraps
the library interface to LAMMPS. This file makes it possible to
run LAMMPS, invoke LAMMPS commands or give it an input script, extract
LAMMPS results, and modify internal LAMMPS variables, either from a
Python script or interactively from a Python prompt. You can do the
former in serial or parallel. Running Python interactively in
parallel does not generally work, unless you have a package installed
that extends your Python to enable multiple instances of Python to
read what you type.
"Python"_http://www.python.org is a powerful scripting and programming
language which can be used to wrap software like LAMMPS and other
packages. It can be used to glue multiple pieces of software
together, e.g. to run a coupled or multiscale model. See "Section_howto
10"_Section_howto.html#howto_10 of the manual and the couple
directory of the distribution for more ideas about coupling LAMMPS to
other codes. See "Section_start 5"_Section_start.html#start_5 about
how to build LAMMPS as a library, and "Section_howto
19"_Section_howto.html#howto_19 for a description of the library
interface provided in src/library.cpp and src/library.h and how to
extend it for your needs. As described below, that interface is what
is exposed to Python. It is designed to be easy to add functions to.
Doing so makes it easy to extend the Python interface as well. See
details below.
By using the Python interface, LAMMPS can also be coupled with a GUI
or other visualization tools that display graphs or animations in real
time as LAMMPS runs. Examples of such scripts are included in the
python directory.
Two advantages of using Python are how concise the language is, and
that it can be run interactively, enabling rapid development and
debugging of programs. If you use it to mostly invoke costly
operations within LAMMPS, such as running a simulation for a
reasonable number of timesteps, then the overhead cost of invoking
LAMMPS thru Python will be negligible.
Before using LAMMPS from a Python script, you have to do two things.
You need to set two environment variables. And you need to build
LAMMPS as a dynamic shared library, so it can be loaded by Python.
Both these steps are discussed below. If you wish to run LAMMPS in
parallel from Python, you also need to extend your Python with MPI.
This is also discussed below.
The Python wrapper for LAMMPS uses the amazing and magical (to me)
"ctypes" package in Python, which auto-generates the interface code
needed between Python and a set of C interface routines for a library.
Ctypes is part of standard Python for versions 2.5 and later. You can
check which version of Python you have installed, by simply typing
"python" at a shell prompt.
:line
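As a rough illustration of the ctypes mechanism mentioned above (this is
not taken from python/lammps.py), the sketch below loads the C runtime
library and calls two of its functions directly from Python. It assumes a
Linux-like system; the names libc, c_abs, and c_strlen are chosen here for
illustration only.

```python
import ctypes
import ctypes.util

# Locate and load the C runtime library.  If find_library() returns None,
# CDLL(None) falls back to the symbols already loaded into the Python
# process (an assumption that holds on Linux-like systems).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signatures so ctypes converts arguments and results correctly.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

def c_abs(n):
    """Call the C library's abs() on a Python int."""
    return libc.abs(n)

def c_strlen(text):
    """Call the C library's strlen() on a byte string."""
    return libc.strlen(text)

if __name__ == "__main__":
    print(c_abs(-5))
    print(c_strlen(b"lammps"))
```

No interface code is compiled here: the C functions are looked up by name
in the shared library at run time.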
:line
11.1 Setting necessary environment variables :link(py_1),h4
For Python to use the LAMMPS interface, it needs to find two files.
The paths to these files need to be added to two environment variables
that Python checks.
The first is the environment variable PYTHONPATH. It needs
to include the directory where the python/lammps.py file is.
For the csh or tcsh shells, you could add something like this to your
~/.cshrc file:
setenv PYTHONPATH ${PYTHONPATH}:/home/sjplimp/lammps/python :pre
The second is the environment variable LD_LIBRARY_PATH, which is used
by the operating system to find dynamic shared libraries when it loads
them. It needs to include the directory where the shared LAMMPS
library will be. Normally this is the LAMMPS src dir, as explained in
the following section.
For the csh or tcsh shells, you could add something like this to your
~/.cshrc file:
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/home/sjplimp/lammps/src :pre
As discussed below, if your LAMMPS build includes auxiliary libraries,
they must also be available as shared libraries for Python to
successfully load LAMMPS. If they are not in default places where the
operating system can find them, then you also have to add their paths
to the LD_LIBRARY_PATH environment variable.
For example, if you are using the dummy MPI library provided in
src/STUBS, you need to add something like this to your ~/.cshrc file:
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/home/sjplimp/lammps/src/STUBS :pre
If you are using the LAMMPS USER-ATC package, you need to add
something like this to your ~/.cshrc file:
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/home/sjplimp/lammps/lib/atc :pre
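To check that the settings above took effect, a small script along the
following lines can report which required directories are missing from an
environment variable. This is only a sketch: the helper name
missing_from_path_var is hypothetical, and the install location is the
example path used in this section, so substitute your own.

```python
import os

def missing_from_path_var(var_name, required_dirs, environ=None):
    """Return the entries of required_dirs that are not listed in the
    os.pathsep-separated environment variable var_name."""
    environ = os.environ if environ is None else environ
    listed = [os.path.normpath(p)
              for p in environ.get(var_name, "").split(os.pathsep) if p]
    return [d for d in required_dirs if os.path.normpath(d) not in listed]

if __name__ == "__main__":
    # Example install location from this section; replace with your own.
    lammps_dir = "/home/sjplimp/lammps"
    print(missing_from_path_var("PYTHONPATH", [lammps_dir + "/python"]))
    print(missing_from_path_var("LD_LIBRARY_PATH", [lammps_dir + "/src"]))
```

An empty list for both variables means the two directories are listed.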
:line
11.2 Building LAMMPS as a shared library :link(py_2),h4
Instructions on how to build LAMMPS as a shared library are given in
"Section_start 5"_Section_start.html#start_5. A shared library is one
that is dynamically loadable, which is what Python requires. On Linux
this is a library file that ends in ".so", not ".a".
From the src directory, type
make makeshlib
make -f Makefile.shlib foo :pre
where foo is the machine target name, such as linux or g++ or serial.
This should create the file liblmp_foo.so in the src directory, as
well as a soft link liblmp.so which is what the Python wrapper will
load by default. If you are building multiple machine versions of the
shared library, the soft link is always set to the most recently built
version.
Note that, as discussed below, a LAMMPS build may depend on several
auxiliary libraries, which are specified in your low-level
src/Makefile.foo file. For example, an MPI library, the FFTW library,
a JPEG library, etc. Depending on what LAMMPS packages you have
installed, the build may also require additional libraries from the
lib directories, such as lib/atc/libatc.so or lib/reax/libreax.so.
You must ensure that each of these libraries exists in shared library
form (*.so file for Linux systems), or either the LAMMPS shared
library build or the Python load of the library will fail. For the
load to be successful all the shared libraries must also be in
directories that the operating system checks. See the discussion in
the preceding section about the LD_LIBRARY_PATH environment variable
for how to ensure this.
Note that some system libraries, such as MPI, if you installed it
yourself, may not be built by default as shared libraries. The build
instructions for the library should tell you how to do this.
For example, here is how to build and install the "MPICH
library"_mpich, a popular open-source version of MPI, distributed by
Argonne National Labs, as a shared library in the default
/usr/local/lib location:
:link(mpich,http://www-unix.mcs.anl.gov/mpi)
./configure --enable-shared
make
make install :pre
You may need to use "sudo make install" in place of the last line if
you do not have write privileges for /usr/local/lib. The end result
should be the file /usr/local/lib/libmpich.so.
Note that not all of the auxiliary libraries provided with LAMMPS have
shared-library Makefiles in their lib directories. Typically this
simply requires a Makefile.foo that adds a -fPIC switch when files are
compiled and "-fPIC -shared" switches when the library is linked
with a C++ (or Fortran) compiler, as well as an output target that
ends in ".so", like libatc.so. As we or others create and contribute
these Makefiles, we will add them to the LAMMPS distribution.
:line
11.3 Extending Python with MPI to run in parallel :link(py_3),h4
If you wish to run LAMMPS in parallel from Python, you need to extend
your Python with an interface to MPI. This also allows you to
make MPI calls directly from Python in your script, if you desire.
There are several Python packages available that purport to wrap MPI
as a library and allow MPI functions to be called from Python.
These include
"pyMPI"_http://pympi.sourceforge.net/
"maroonmpi"_http://code.google.com/p/maroonmpi/
"mpi4py"_http://code.google.com/p/mpi4py/
"myMPI"_http://nbcr.sdsc.edu/forum/viewtopic.php?t=89&sid=c997fefc3933bd66204875b436940f16
"Pypar"_http://datamining.anu.edu.au/~ole/pypar :ul
All of these except pyMPI work by wrapping the MPI library (which must
be available on your system as a shared library, as discussed above),
and exposing (some portion of) its interface to your Python script.
This means Python cannot be used interactively in parallel, since they
do not address the issue of interactive input to multiple instances of
Python running on different processors. The one exception is pyMPI,
which alters the Python interpreter to address this issue, and (I
believe) creates a new alternate executable (in place of "python"
itself) as a result.
In principle any of these Python/MPI packages should work to invoke
LAMMPS in parallel and MPI calls themselves from a Python script which
is itself running in parallel. However, when I downloaded and looked
at a few of them, their documentation was incomplete and I had trouble
with their installation. It's not clear if some of the packages are
still being actively developed and supported.
The one I recommend, since I have successfully used it with LAMMPS, is
Pypar. Pypar requires the ubiquitous "Numpy
package"_http://numpy.scipy.org be installed in your Python. After
launching python, type
import numpy :pre
to see if it is installed. If not, here is how to install it (version
1.3.0b1 as of April 2009). Unpack the numpy tarball and from its
top-level directory, type
python setup.py build
sudo python setup.py install :pre
The "sudo" is only needed if required to copy Numpy files into your
Python distribution's site-packages directory.
To install Pypar (version pypar-2.1.0_66 as of April 2009), unpack it
and from its "source" directory, type
python setup.py build
sudo python setup.py install :pre
Again, the "sudo" is only needed if required to copy Pypar files into
your Python distribution's site-packages directory.
If you have successfully installed Pypar, you should be able to run
python serially and type
import pypar :pre
without error. You should also be able to run python in parallel
on a simple test script
% mpirun -np 4 python test.py :pre
where test.py contains the lines
import pypar
print("Proc %d out of %d procs" % (pypar.rank(),pypar.size())) :pre
and see one line of output for each processor you run on.
:line
11.4 Testing the Python-LAMMPS interface :link(py_4),h4
To test if LAMMPS is callable from Python, launch Python interactively
and type:
>>> from lammps import lammps
>>> lmp = lammps() :pre
If you get no errors, you're ready to use LAMMPS from Python.
If the load fails, the most common error to see is one raised by the
ctypes CDLL() call,
which means Python was unable to load the LAMMPS shared library. This
can occur if it can't find the LAMMPS library; see the environment
variable discussion "above"_#py_1. It can also occur if it can't find
one of the auxiliary libraries that was specified in the LAMMPS build,
in a shared dynamic library format. This includes all libraries needed by
main LAMMPS (e.g. MPI or FFTW or JPEG), system libraries needed by
main LAMMPS (e.g. extra libs needed by MPI), or packages you have
installed that require libraries provided with LAMMPS (e.g. the
USER-ATC package requires lib/atc/libatc.so) or system libraries
(e.g. BLAS or Fortran-to-C libraries) listed in the
lib/package/Makefile.lammps file. Again, all of these must be
available as shared libraries, or the Python load will fail.
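One way to narrow down which library is at fault is to try loading each
candidate on its own with ctypes and print whatever message the loader
gives. The sketch below does this; the helper name try_load is
hypothetical, and the library names are just the examples mentioned in
this section, so adjust the list to match your build.

```python
import ctypes

def try_load(libname):
    """Try to load a shared library; return (True, "") on success or
    (False, message) with the loader's error text on failure."""
    try:
        ctypes.CDLL(libname)
    except OSError as err:
        return False, str(err)
    return True, ""

if __name__ == "__main__":
    # liblmp.so is the default name the wrapper loads; the other entries
    # are examples of auxiliary libraries a build might depend on.
    for name in ["liblmp.so", "libmpich.so", "libatc.so"]:
        ok, message = try_load(name)
        print("%-12s %s" % (name, "ok" if ok else "FAILED: " + message))
```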
Python (actually the operating system) isn't verbose about telling you
why the load failed, so go through the steps above and in
"Section_start 5"_Section_start.html#start_5 carefully.
[Test LAMMPS and Python in serial:] :h5
To run a LAMMPS test in serial, type these lines into Python
interactively from the bench directory:
>>> from lammps import lammps
>>> lmp = lammps()
>>> lmp.file("in.lj") :pre
Or put the same lines in the file test.py and run it as
% python test.py :pre
Either way, you should see the results of running the in.lj benchmark
on a single processor appear on the screen, the same as if you had
typed something like:
lmp_g++ < in.lj :pre
[Test LAMMPS and Python in parallel:] :h5
To run LAMMPS in parallel, assuming you have installed the
"Pypar"_http://datamining.anu.edu.au/~ole/pypar package as discussed
above, create a test.py file containing these lines:
import pypar
from lammps import lammps
lmp = lammps()
lmp.file("in.lj")
print("Proc %d out of %d procs has %s" % (pypar.rank(),pypar.size(),lmp))
pypar.finalize() :pre
You can then run it in parallel as:
% mpirun -np 4 python test.py :pre
and you should see the same output as if you had typed
% mpirun -np 4 lmp_g++ < in.lj :pre
Note that if you leave out the 3 lines from test.py that specify Pypar
commands you will instantiate and run LAMMPS independently on each of
the P processors specified in the mpirun command. In this case you
should get 4 sets of output, each showing that a run was made on a
single processor, instead of one set of output showing that it ran on
4 processors. If the 1-processor outputs occur, it means that Pypar
is not working correctly.
Also note that once you import the Pypar module, Pypar initializes MPI
for you, and you can use MPI calls directly in your Python script, as
described in the Pypar documentation. The last line of your Python
script should be pypar.finalize(), to ensure MPI is shut down
correctly.
Note that any Python script (not just for LAMMPS) can be invoked in
one of several ways:
% python foo.script
% python -i foo.script
% foo.script :pre
The last command requires that the first line of the script be
something like this:
#!/usr/local/bin/python
#!/usr/local/bin/python -i :pre
where the path points to where you have Python installed, and that you
have made the script file executable:
% chmod +x foo.script :pre
Without the "-i" flag, Python will exit when the script finishes.
With the "-i" flag, you will be left in the Python interpreter when
the script finishes, so you can type subsequent commands. As
mentioned above, you can only run Python interactively when running
Python on a single processor, not in parallel.
:line
:line
11.5 Using LAMMPS from Python :link(py_5),h4
The Python interface to LAMMPS consists of a Python "lammps" module,
the source code for which is in python/lammps.py, which creates a
"lammps" object, with a set of methods that can be invoked on that
object. The sample Python code below assumes you have first imported
the "lammps" module in your Python script. You can also import its
settings as follows, which are useful for testing return values from some
of the methods described below:
from lammps import lammps
from lammps import LMPINT as INT
from lammps import LMPDOUBLE as DOUBLE
from lammps import LMPIPTR as IPTR
from lammps import LMPDPTR as DPTR
from lammps import LMPDPTRPTR as DPTRPTR :pre
These are the methods defined by the lammps module. If you look
at the file src/library.cpp you will see that they correspond
one-to-one with calls you can make to the LAMMPS library from a C++ or
C or Fortran program.
lmp = lammps() # create a LAMMPS object using the default liblmp.so library
lmp = lammps("g++") # create a LAMMPS object using the liblmp_g++.so library
lmp = lammps("",list) # ditto, with command-line args, list = \["-echo","screen"\]
lmp = lammps("g++",list) :pre
lmp.close() # destroy a LAMMPS object :pre
lmp.file(file) # run an entire input script, file = "in.lj"
lmp.command(cmd) # invoke a single LAMMPS command, cmd = "run 100" :pre
xlo = lmp.extract_global(name,type) # extract a global quantity
# name = "boxxlo", "nlocal", etc
# type = INT or DOUBLE :pre
coords = lmp.extract_atom(name,type) # extract a per-atom quantity
# name = "x", "type", etc
# type = IPTR or DPTR or DPTRPTR :pre
eng = lmp.extract_compute(id,style,type) # extract value(s) from a compute
v3 = lmp.extract_fix(id,style,type,i,j) # extract value(s) from a fix
# id = ID of compute or fix
# style = 0 = global data
# 1 = per-atom data
# 2 = local data
# type = 0 = scalar
# 1 = vector
# 2 = array
# i,j = indices of value in global vector or array :pre
var = lmp.extract_variable(name,group,flag) # extract value(s) from a variable
# name = name of variable
# group = group ID (ignored for equal-style variables)
# flag = 0 = equal-style variable
# 1 = atom-style variable :pre
natoms = lmp.get_natoms() # total # of atoms as int
x = lmp.get_coords() # return coords of all atoms in x
lmp.put_coords(x) # set all atom coords via x :pre
:line
IMPORTANT NOTE: Currently, the creation of a LAMMPS object does not
take an MPI communicator as an argument. There should be a way to do
this, so that the LAMMPS instance runs on a subset of processors if
desired, but I don't know how to do it from Pypar. So for now, it
runs on MPI_COMM_WORLD, which is all the processors. If someone
figures out how to do this with one or more of the Python wrappers for
MPI, like Pypar, please let us know and we will amend these doc pages.
Note that you can create multiple LAMMPS objects in your Python
script, and coordinate and run multiple simulations, e.g.
from lammps import lammps
lmp1 = lammps()
lmp2 = lammps()
lmp1.file("in.file1")
lmp2.file("in.file2") :pre
The file() and command() methods allow an input script or single
commands to be invoked.
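As an illustration of driving LAMMPS one command at a time, the sketch
below feeds the lines of a multi-line string to command(). The helper
run_commands and the RecordingStub class are hypothetical; the stub stands
in for a real lammps object so the example runs without LAMMPS. Input
lines continued with "&" are not handled in this sketch.

```python
def run_commands(lmp, text):
    """Send each non-blank, non-comment line of text to lmp.command().
    Returns the list of commands that were issued."""
    issued = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        lmp.command(line)
        issued.append(line)
    return issued

class RecordingStub(object):
    """Hypothetical stand-in for a lammps object; it only records commands."""
    def __init__(self):
        self.commands = []
    def command(self, cmd):
        self.commands.append(cmd)

if __name__ == "__main__":
    stub = RecordingStub()
    run_commands(stub, """
        # short run
        run 100
    """)
    print(stub.commands)
```

With a real LAMMPS object, pass lmp = lammps() in place of the stub.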
The extract_global(), extract_atom(), extract_compute(),
extract_fix(), and extract_variable() methods return values or
pointers to data structures internal to LAMMPS.
For extract_global() see the src/library.cpp file for the list of
valid names. New names could easily be added. A double or integer is
returned. You need to specify the appropriate data type via the type
argument.
For extract_atom(), a pointer to internal LAMMPS atom-based data is
returned, which you can use via normal Python subscripting. See the
extract() method in the src/atom.cpp file for a list of valid names.
Again, new names could easily be added. A pointer to a vector of
doubles or integers, or a pointer to an array of doubles (double **)
is returned. You need to specify the appropriate data type via the
type argument.
For extract_compute() and extract_fix(), the global, per-atom, or
local data calculated by the compute or fix can be accessed. What is
returned depends on whether the compute or fix calculates a scalar or
vector or array. For a scalar, a single double value is returned. If
the compute or fix calculates a vector or array, a pointer to the
internal LAMMPS data is returned, which you can use via normal Python
subscripting. The one exception is that for a fix that calculates a
global vector or array, a single double value from the vector or array
is returned, indexed by I (vector) or I and J (array). I,J are
zero-based indices. The I,J arguments can be left out if not needed.
See "Section_howto 15"_Section_howto.html#howto_15 of the manual for a
discussion of global, per-atom, and local data, and of scalar, vector,
and array data types. See the doc pages for individual
"computes"_compute.html and "fixes"_fix.html for a description of what
they calculate and store.
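Because the style and type arguments are bare integers, giving them names
can make calling code easier to read. The sketch below is one way to do
that; the constant names, the two helper functions, and the ExtractStub
class are all hypothetical and are not part of python/lammps.py. The IDs
"thermo_temp" and "myfix" are only examples.

```python
# Names for the integer style and type arguments listed above.
STYLE_GLOBAL, STYLE_ATOM, STYLE_LOCAL = 0, 1, 2
TYPE_SCALAR, TYPE_VECTOR, TYPE_ARRAY = 0, 1, 2

def global_scalar(lmp, compute_id):
    """Return the global scalar calculated by the compute with this ID."""
    return lmp.extract_compute(compute_id, STYLE_GLOBAL, TYPE_SCALAR)

def fix_vector_value(lmp, fix_id, i):
    """Return element i (zero-based) of a fix's global vector."""
    return lmp.extract_fix(fix_id, STYLE_GLOBAL, TYPE_VECTOR, i)

class ExtractStub(object):
    """Hypothetical stand-in that echoes back the arguments it receives."""
    def extract_compute(self, cid, style, dtype):
        return ("compute", cid, style, dtype)
    def extract_fix(self, fid, style, dtype, i=0, j=0):
        return ("fix", fid, style, dtype, i, j)

if __name__ == "__main__":
    stub = ExtractStub()
    print(global_scalar(stub, "thermo_temp"))
    print(fix_vector_value(stub, "myfix", 2))
```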
For extract_variable(), an "equal-style or atom-style
variable"_variable.html is evaluated and its result returned.
For equal-style variables a single double value is returned and the
group argument is ignored. For atom-style variables, a vector of
doubles is returned, one value per atom, which you can use via normal
Python subscripting. The values will be zero for atoms not in the
specified group.
The get_natoms() method returns the total number of atoms in the
simulation, as an int. Note that extract_global("natoms") returns the
same value, but as a double, which is the way LAMMPS stores it to
allow for systems with more atoms than can be stored in an int (> 2
billion).
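The int versus double distinction can be seen with ctypes alone. The
sketch below assumes a C int is 32 bits, which is true on common
platforms; the helper names are hypothetical.

```python
import ctypes

INT_MAX = 2**31 - 1   # largest value of a 32-bit signed int

def fits_in_int(count):
    """True if count survives a round trip through a C int unchanged."""
    return ctypes.c_int(count).value == count

def exact_as_double(count):
    """True if count is represented exactly by a C double."""
    return int(ctypes.c_double(count).value) == count

if __name__ == "__main__":
    three_billion = 3 * 10**9
    print(fits_in_int(INT_MAX))
    print(fits_in_int(three_billion))
    print(exact_as_double(three_billion))
```

An atom count of 3 billion wraps around when forced into a C int but is
held exactly by a double.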
The get_coords() method returns a ctypes vector of doubles of length
3*natoms, for the coordinates of all the atoms in the simulation,
ordered by x,y,z and then by atom ID (see code for put_coords()
below). The array can be used via normal Python subscripting. If
atom IDs are not consecutively ordered within LAMMPS, a None is
returned as indication of an error.
Note that the data structure get_coords() returns is different from
the data structure returned by extract_atom("x") in four ways. (1)
Get_coords() returns a vector which you index as x\[i\];
extract_atom() returns an array which you index as x\[i\]\[j\]. (2)
Get_coords() orders the atoms by atom ID while extract_atom() does
not. (3) Get_coords() returns a list of all atoms in the simulation;
extract_atoms() returns just the atoms local to each processor. (4)
Finally, the get_coords() data structure is a copy of the atom coords
stored internally in LAMMPS, whereas extract_atom returns an array
that points directly to the internal data. This means you can change
values inside LAMMPS from Python by assigning a new values to the
extract_atom() array. To do this with the get_atoms() vector, you
need to change values in the vector, then invoke the put_coords()
method.
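The flat layout of the get_coords() vector can be made concrete with a
small indexing helper. In the sketch below, coord_index and coords_by_id
are hypothetical names, and a hand-built ctypes vector stands in for the
value returned by get_coords().

```python
from ctypes import c_double

def coord_index(atom_id, dim):
    """Index into the flat get_coords() vector for atom ID atom_id
    (1-based) and dimension dim (0 = x, 1 = y, 2 = z)."""
    return 3 * (atom_id - 1) + dim

def coords_by_id(x, natoms):
    """Convert the flat vector into a dict mapping atom ID -> (x, y, z)."""
    return dict((i, (x[coord_index(i, 0)],
                     x[coord_index(i, 1)],
                     x[coord_index(i, 2)]))
                for i in range(1, natoms + 1))

if __name__ == "__main__":
    # Stand-in for x = lmp.get_coords() in a system with natoms = 2.
    x = (c_double * 6)(0.0, 0.5, 1.0, 2.0, 2.5, 3.0)
    print(coords_by_id(x, 2))
```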
The put_coords() method takes a vector of coordinates for all atoms in
the simulation, assumed to be ordered by x,y,z and then by atom ID,
and uses the values to overwrite the corresponding coordinates for
each atom inside LAMMPS. This requires LAMMPS to have its "map"
option enabled; see the "atom_modify"_atom_modify.html command for
details. If it is not or if atom IDs are not consecutively ordered,
no coordinates are reset.
The array of coordinates passed to put_coords() must be a ctypes
vector of doubles, allocated and initialized something like this:
from ctypes import *
natoms = lmp.get_natoms()
n3 = 3*natoms
x = (c_double*n3)()
x[0] = x coord of atom with ID 1
x[1] = y coord of atom with ID 1
x[2] = z coord of atom with ID 1
x[3] = x coord of atom with ID 2
...
x[n3-1] = z coord of atom with ID natoms
lmp.put_coords(x) :pre
Alternatively, you can just change values in the vector returned by
get_coords(), since it is a ctypes vector of doubles.
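A round trip through get_coords() and put_coords() might look like the
sketch below, which shifts every atom by a fixed displacement. The
function displace_all and the CoordsStub class are hypothetical; the stub
imitates the three methods described above so the example runs without
LAMMPS.

```python
from ctypes import c_double

def displace_all(lmp, dx, dy, dz):
    """Shift every atom by (dx, dy, dz) using get_coords()/put_coords().
    Returns False if get_coords() signalled an error by returning None."""
    natoms = lmp.get_natoms()
    x = lmp.get_coords()
    if x is None:
        return False
    shift = (dx, dy, dz)
    for k in range(3 * natoms):
        x[k] += shift[k % 3]      # k % 3 selects the x, y, or z component
    lmp.put_coords(x)
    return True

class CoordsStub(object):
    """Hypothetical stand-in for a lammps object holding two atoms."""
    def __init__(self):
        self.stored = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
    def get_natoms(self):
        return len(self.stored) // 3
    def get_coords(self):
        return (c_double * len(self.stored))(*self.stored)
    def put_coords(self, x):
        self.stored = list(x)

if __name__ == "__main__":
    stub = CoordsStub()
    displace_all(stub, 1.0, 2.0, 3.0)
    print(stub.stored)
```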
:line
As noted above, these Python class methods correspond one-to-one with
the functions in the LAMMPS library interface in src/library.cpp and
library.h. This means you can extend the Python wrapper via the
following steps:
Add a new interface function to src/library.cpp and
src/library.h. :ulb,l
Rebuild LAMMPS as a shared library. :l
Add a wrapper method to python/lammps.py for this interface
function. :l
You should now be able to invoke the new interface function from a
Python script. Isn't ctypes amazing? :l,ule
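The general shape of a ctypes wrapper method can be sketched with the C
runtime library standing in for the LAMMPS shared library. Everything
below is illustrative: the class clib and its methods are hypothetical and
do not reproduce the contents of python/lammps.py, and a Linux-like system
is assumed.

```python
import ctypes
import ctypes.util

class clib(object):
    """Hypothetical wrapper class around the C runtime library."""
    def __init__(self):
        # If find_library() returns None, CDLL(None) uses the symbols
        # already loaded into the process (true on Linux-like systems).
        self.lib = ctypes.CDLL(ctypes.util.find_library("c"))

    # An existing wrapper method.
    def labs(self, n):
        self.lib.labs.argtypes = [ctypes.c_long]
        self.lib.labs.restype = ctypes.c_long
        return self.lib.labs(n)

    # A newly added wrapper method: one more C function exposed to Python.
    def atoi(self, text):
        self.lib.atoi.argtypes = [ctypes.c_char_p]
        self.lib.atoi.restype = ctypes.c_int
        return self.lib.atoi(text)

if __name__ == "__main__":
    c = clib()
    print(c.labs(-3))
    print(c.atoi(b"42"))
```

Adding a method is a few lines of Python once the C function exists in the
shared library.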
:line
:line
11.6 Example Python scripts that use LAMMPS :link(py_6),h4
These are the Python scripts included as demos in the python/examples
directory of the LAMMPS distribution, to illustrate the kinds of
things that are possible when Python wraps LAMMPS. If you create your
own scripts, send them to us and we can include them in the LAMMPS
distribution.
trivial.py, read/run a LAMMPS input script thru Python,
demo.py, invoke various LAMMPS library interface routines,
simple.py, mimic operation of couple/simple/simple.cpp in Python,
gui.py, GUI go/stop/temperature-slider to control LAMMPS,
plot.py, real-time temperature plot with GnuPlot via Pizza.py,
viz_tool.py, real-time viz via some viz package,
vizplotgui_tool.py, combination of viz_tool.py and plot.py and gui.py :tb(c=2)
:line
For the viz_tool.py and vizplotgui_tool.py commands, replace "tool"
with "gl" or "atomeye" or "pymol" or "vmd", depending on what
visualization package you have installed.
Note that for GL, you need to be able to run the Pizza.py GL tool,
which is included in the pizza sub-directory. See the "Pizza.py doc
pages"_pizza for more info:
:link(pizza,http://www.sandia.gov/~sjplimp/pizza.html)
Note that for AtomEye, you need version 3, and there is a line in the
scripts that specifies the path and name of the executable. See the
AtomEye WWW pages "here"_atomeye or "here"_atomeye3 for more details:
http://mt.seas.upenn.edu/Archive/Graphics/A
http://mt.seas.upenn.edu/Archive/Graphics/A3/A3.html :pre
:link(atomeye,http://mt.seas.upenn.edu/Archive/Graphics/A)
:link(atomeye3,http://mt.seas.upenn.edu/Archive/Graphics/A3/A3.html)
The latter link is to AtomEye 3 which has the scripting
capability needed by these Python scripts.
Note that for PyMol, you need to have built and installed the
open-source version of PyMol in your Python, so that you can import it
from a Python script. See the PyMol WWW pages "here"_pymol or
"here"_pymolopen for more details:
http://www.pymol.org
http://sourceforge.net/scm/?type=svn&group_id=4546 :pre
:link(pymol,http://www.pymol.org)
:link(pymolopen,http://sourceforge.net/scm/?type=svn&group_id=4546)
The latter link is to the open-source version.
Note that for VMD, you need a fairly current version (1.8.7 works for
me) and there are some lines in the pizza/vmd.py script for 4 PIZZA
variables that have to match the VMD installation on your system.
:line
See the python/README file for instructions on how to run them and the
source code for individual scripts for comments about what they do.
Here are screenshots of the vizplotgui_tool.py script in action for
different visualization package options. Click to see larger images:
:image(JPG/screenshot_gl_small.jpg,JPG/screenshot_gl.jpg)
:image(JPG/screenshot_atomeye_small.jpg,JPG/screenshot_atomeye.jpg)
:image(JPG/screenshot_pymol_small.jpg,JPG/screenshot_pymol.jpg)
:image(JPG/screenshot_vmd_small.jpg,JPG/screenshot_vmd.jpg)