Accurate gait event detection is crucial for exploiting the ample health information embedded in gait. As an alternative to laborious and demanding sensor-based detection, vision-based approaches have emerged. Yet their complicated feature-engineering process and heavy reliance on lateral views remain challenges. This study aimed to propose a view-independent, vision-based gait event detection method using deep learning networks that requires no pre-processing. A total of 22 participants performed seven different walking- and running-related actions, and the sequential video frames acquired during these actions were used as inputs to the deep learning networks, which output the probability of gait events. A Transformer network and ResNet18 trained on sequential video frames achieved an F1-score of 0.956 or higher for walking straight and walking around. Detection performance on the frontal, lateral, and backside views did not differ substantially. The findings enhance the applicability of the vision-based approach and contribute to increasing its utility in health monitoring.
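As a minimal sketch of the evaluation metric reported above, the following computes a frame-level F1-score from per-frame event probabilities. This is an illustration only, not the authors' code: the threshold of 0.5 and the function name are assumptions.

```python
# Hypothetical sketch (not the authors' implementation): frame-level F1-score
# for gait event detection, where a network outputs a per-frame probability
# and frames at or above an assumed threshold are counted as detected events.

def f1_score(y_true, y_prob, threshold=0.5):
    """Binary F1 from ground-truth labels (0/1) and predicted probabilities."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

An F1-score of 0.956 thus indicates that both precision and recall of the detected gait events were high across the walking conditions.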