General-purpose computing on graphics processing units (GPGPU) is used not only to offload heavy computations from the CPU but also to perform them faster than is possible on CPUs. This is commonly referred to as GPU acceleration; it is a well-studied area on the PC platform that has received very little attention on commodity embedded devices. Just as the PC GPU is used to perform computations whose execution time would be prohibitive on its accompanying CPU, the embedded GPU can accelerate computations normally done by the embedded CPU. This work presents an implementation of a feasible, real-time, GPU-accelerated stereo matching solution using Broadcom's VideoCore IV GPU (BCM2835 System on a Chip). Details include the delimitation process, design considerations, and optimization techniques.