We present a revised version of the OptQC program of Loke et al. (2014) . We have removed the simulated annealing process in favour of a descending random walk. We have also introduced a new method for iteratively generating permutation matrices during the random walk process, providing a reduced total cost for implementing the quantum circuit. Lastly, we have also added a synchronization mechanism between threads, giving quicker convergence to more optimal solutions. New version program summary Program title: OptQC v1.3 Catalogue identifier: AEUA_v1_3 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEUA_v1_3.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 240903 No. of bytes in distributed program, including test data, etc.: 632395 Distribution format: tar.gz Programming language: Fortran, MPI. Computer: Any computer with Fortran compiler (not gfortran4.9 or earlier) and MPI library. Operating system: Linux. Classification: 4.15. Catalogue identifier of previous version: AEUA_v1_3 Journal reference of previous version: Comput. Phys. Comm. 185(2014)3307 External routines: Intel MKL LAPACK routines and MPI routines. Does the new version supersede the previous version?: Yes Nature of problem: It aims to minimize the number of quantum gates required to implement a given unitary operation. Solution method: It utilizes a descending random walk to select permutation matrices P and Q for a given unitary matrix U such that the number of gates in the quantum circuit of U=QTPTU'PQ is minimized, where U' is equivalent to U up to a permutation. The decomposition of a unitary operator is performed by recursively applying the cosine–sine decomposition. Reasons for new version: Simulated annealing process was found to give suboptimal results compared to a normal descending random walk. Computation time was also bloated by the necessity of running the CS decomposition thrice (for U',P and PT) for each iteration of the optimization process. Summary of revisions: • Simulated annealing process was replaced by a descending random walk (equivalent to setting the threshold value β to 0), due to poor convergence of the simulated annealing process, and due to the lack of pathological instances of local minima in this particular search space.• Introduced an iterative method for generating the general permutation matrix P, in which we start with P=I and build up the quantum circuit (and modify P correspondingly) by adding gates onto the existing circuit. This removes the requirement of running the CS decomposition on P and PT to find the quantum circuit implementation, since it is already known by construction.• Added a synchronization mechanism between threads (after some prescribed number of iterations) in which the current state of the top 10% processes with the fittest solutions is copied over to the remaining 90%. This works so as to discard the less fit solutions and focuses the searching algorithm in the state space with the fittest solutions.Additional comments: The program contains some Fortran2003 features and will not compile with gcc4.9 or earlier. Running time: As before, running time increases with the size of the unitary matrix, as well as the prescribed maximum number of iterations for qubit permutation selection and the descending random walk. All simulation results presented in this paper are obtained from running the program on the Fornax supercomputer managed by iVEC@UWA with Intel Xeon X5650 CPUs. A comparison of running times is also given in Table 1. References:  T. Loke, J.B. Wang, Y.H. Chen, Comput. Phys. Comm. 185 (2014) 3307.